Abstract

Planning research in Artificial Intelligence (AI) has often focused on problems where there are cascading levels of action
choice and complex interactions between actions. In contrast, scheduling research has focused on much larger problems
where there is little action choice, but the resulting ordering problem is hard. In this paper, we give an overview of AI 
planning and scheduling techniques, focusing on their similarities, differences, and limitations. We also argue that many 
difficult practical problems lie somewhere between planning and scheduling, and that neither area has the right set of tools 
for solving these vexing problems. 

1 The Ambitious Spacecraft 

Imagine a hypothetical spacecraft en route to a distant planet. Between propulsion cycles, there are time windows when
the craft can be turned for communication and scientific observations. At any given time, the spacecraft has a large set 
of possible scientific observations that it can perform, each having some value or priority. For each observation, the 
spacecraft will need to be turned towards the target and the required measurement or exposure taken. Unfortunately, 
turning to a target is a slow operation that may take up to 30 minutes, depending on the magnitude of the turn. As a 
result, the choice of experiments and the order in which they are performed has a significant impact on the duration of 
turns and, therefore, on how much can be accomplished. All this is further complicated by several things: 

• There is overlap among the capabilities of instruments, so there may be a choice to make for a given observa- 
tion. Naturally, the different instruments point in different directions, so the choice of instrument influences 
the direction and duration of the turn. 

• Instruments must be calibrated before use, which requires turning to one of a number of possible calibration 
targets. Recalibration is not required if successive observations are made with the same instrument. 

• Turning uses up limited fuel and observations use power. Power is limited but renewable at a rate that depends 
on which direction the solar panels are facing. 

Given all of this, the objective is to maximize scientific return for the mission or at least to use the available time wisely. 

Of course, this problem is not hypothetical at all. It occurs for space probes like Deep Space One, planetary rovers like 
Mars Sojourner, space-based observatories like the Hubble Space Telescope, airborne observatories like KAO and SOFIA,
and even automated terrestrial observatories. It is also quite similar to maintenance planning problems, where
there may be a cascading set of choices for facilities, tools, and personnel, all of which affect the duration and possible 
ordering of various repair operations. 

What makes these problems particularly hard is that they are optimization problems that involve continuous time, re- 
sources, metric quantities, and a complex mixture of action choices and ordering decisions. In AI, problems involving 
choice of actions are often regarded as planning problems. Unfortunately, few AI planning systems could even repre- 
sent the constraints in the above problem, much less perform the desired reasoning and optimization. While scheduling 
systems would have an easier time representing the time constraints and resources, most could not deal with the action 
choices in this problem. In a sense, these problems lie squarely in between planning and scheduling, and it is our con- 
tention that many important practical problems fall into this area. 

In this paper, we provide an introduction to prominent AI planning and scheduling techniques, but from a rather critical
perspective - namely, we examine the prospects of these techniques for helping to solve the kind of problem introduced 
above. We start with traditional planning techniques in Section 2 and move to scheduling in Section 3. We have not 
attempted to cover the vast array of scheduling techniques developed within the OR community. Instead we limit our 
discussion to techniques developed and used within the AI scheduling community. In Section 4, we return to planning 
again and examine recent work on extending planning techniques to provide the capabilities often found in scheduling 
systems (and desperately needed for our spacecraft problem). 

2 Planning 

Open any recent AI textbook [118, 63, 117, 57] and you find a chapter on planning. Yet, it is difficult to find a succinct
definition of planning, independent of the particular formalisms and algorithms being used. Fundamentally, planning 
is a synthesis task. It involves formulating a course of action to achieve some desired objective or objectives. In the 
most general sense, a course of action could be any program of actions and might include such things as conditional 
branches, loops, and parallel actions. In practice, the form is often restricted to simple sequences or partial orderings 
of actions. The objective in a planning problem can encompass many things, including achieving a set of goals,
instantiating and performing an abstract task, or optimizing some objective function.

This definition of planning is very general and encompasses several specialized types of problems, including motion 
or path planning, assembly sequence planning, production planning, and scheduling. Yet, work in planning seems to 
have little connection with work that typically goes on in these more specialized areas. The reason is that the focus of 
work in AI planning has been very different. To a large extent, AI planning work has concentrated on problems that 
involve cascading levels of action selection with complicated logical interactions between actions. In addition, most 
work in planning has made strong assumptions about time, resources, the nature of actions and events, and the nature 
of the objective. We will talk more about these assumptions later in this section.

Much of the planning work in AI has fallen into one of three camps: classical planning, Hierarchical Task Network
(HTN) planning, and decision-theoretic planning. (Two other categories, case-based planning and reactive planning,
will not be considered here, because they are less relevant to the solution of combinatorial optimization problems like 
the spacecraft problem.) For each of these approaches, there are survey articles that describe the history and techniques 
in greater detail [133, 134, 34, 21, 24]. We do not attempt to duplicate all this material here. In the following sections, 
we will give a brief introduction to these three approaches, concentrating on the relationships and shortcomings of the 
approaches with respect to the ambitious spacecraft. 

2.1 Classical Planning 

2.1.1 Representation 

Over the last 30 years, much of the work done on planning falls loosely into what could be called the classical planning 
paradigm. In a classical planning problem, the objective is to achieve a given set of goals, usually expressed as a set of 
positive and negative literals in the propositional calculus. The initial state of the world, referred to as the initial
conditions, is also expressed as a set of literals. The possible actions are characterized using what are known as STRIPS
operators. A STRIPS operator is a parameterized template for a set of possible actions. It contains a set of preconditions 
that must be true before the action can be executed and a set of changes or effects that the action will have on the world. 
Both the preconditions and effects can be positive or negative literals. 

Consider our spacecraft example from the introduction. Using STRIPS operators, we might model a simplified version 
of the action of turning the spacecraft to a target as: 

Turn(?target):

Preconditions: Pointing(?direction), ?direction ≠ ?target

Effects: ¬Pointing(?direction), Pointing(?target)

This operator has one parameter, the target orientation for the turn. The preconditions specify that before a turn can be 
performed the spacecraft must be pointing in some direction other than the target direction. If this is the case, following 
the turn operation, the spacecraft will no longer be pointing in the original direction and will instead be pointing in the 
target direction. Note that the effects of the operator only specify those things that change as a result of performing the 
operator. The status of the spacecraft cameras is not changed by the operation, nor is the status of any other proposition
not explicitly mentioned in the effects of the operator. 

Similarly, we might model the two operations for calibrating an instrument and taking an image of a target as:

Calibrate(?instrument):

Preconditions: Status(?instrument, On), Calibration-Target(?target), Pointing(?target)

Effects: ¬Status(?instrument, On), Status(?instrument, Calibrated)

TakeImage(?target, ?instrument):

Preconditions: Status(?instrument, Calibrated), Pointing(?target)

Effects: Image(?target)
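
As an illustration only, the three operators above can be written down as ground STRIPS actions in a few lines of Python. This is a minimal sketch, not code from any planner discussed here: the names Action, turn, calibrate, take_image, applicable, apply_action and run_demo are all hypothetical, propositions are encoded as tuples, and each negative effect is stored in a delete set.

```python
from typing import FrozenSet, NamedTuple, Tuple

Atom = Tuple[str, ...]  # a ground proposition, e.g. ("Pointing", "Earth")


class Action(NamedTuple):
    name: str
    pre: FrozenSet[Atom]     # propositions that must hold before execution
    add: FrozenSet[Atom]     # positive effects
    delete: FrozenSet[Atom]  # negative effects


def turn(direction: str, target: str) -> Action:
    # Ground instance of Turn(?target) with ?direction bound explicitly.
    if direction == target:
        raise ValueError("Turn requires ?direction != ?target")
    return Action(f"Turn({target})",
                  frozenset({("Pointing", direction)}),
                  frozenset({("Pointing", target)}),
                  frozenset({("Pointing", direction)}))


def calibrate(instrument: str, target: str) -> Action:
    return Action(f"Calibrate({instrument})",
                  frozenset({("Status", instrument, "On"),
                             ("Calibration-Target", target),
                             ("Pointing", target)}),
                  frozenset({("Status", instrument, "Calibrated")}),
                  frozenset({("Status", instrument, "On")}))


def take_image(target: str, instrument: str) -> Action:
    return Action(f"TakeImage({target}, {instrument})",
                  frozenset({("Status", instrument, "Calibrated"),
                             ("Pointing", target)}),
                  frozenset({("Image", target)}),
                  frozenset())


def applicable(state: FrozenSet[Atom], action: Action) -> bool:
    return action.pre <= state


def apply_action(state: FrozenSet[Atom], action: Action) -> FrozenSet[Atom]:
    # Only propositions named in the effects change; everything else persists.
    if not applicable(state, action):
        raise ValueError(f"{action.name} is not applicable")
    return (state - action.delete) | action.add


def run_demo() -> FrozenSet[Atom]:
    state = frozenset({("Pointing", "Earth"),
                       ("Status", "Camera", "On"),
                       ("Calibration-Target", "Star")})
    plan = [turn("Earth", "Star"), calibrate("Camera", "Star"),
            turn("Star", "Asteroid"), take_image("Asteroid", "Camera")]
    for action in plan:
        state = apply_action(state, action)
    return state
```

Calling run_demo() executes a four-step plan from a made-up initial state and returns a state containing ("Image", "Asteroid").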

As originally defined, both the preconditions and effects for STRIPS operators were limited to being a conjunctive list 
of literals. However, a number of more recent planning systems have allowed an extended version of the STRIPS language
known as ADL [104]. ADL allows disjunction in the preconditions, conditionals in the effects, and limited universal
quantification in both the preconditions and effects. Quantification and conditionals in the effects of an action
turn out to be particularly useful for expressing things like “all the packages in a truck change location when the truck 
changes location”. 

There are, however, a number of more serious limitations with the STRIPS and ADL representations: 

Atomic time. There is no explicit model of time in the representation. One cannot specify the duration of an 
action or specify time constraints on either goals or actions. In effect, actions are modeled as if they were
instantaneous and uninterruptible, so there is no provision for allowing simultaneous actions or representing
external or exogenous events.

Resources. There is no provision for specifying resource requirements or consumption. We cannot say that turn- 
ing the spacecraft uses up a certain amount of fuel, or that taking an image uses a certain amount of power 
or data storage. 

Uncertainty. There is no ability to model uncertainty. The initial state of the world must be known with certainty, 
and the outcomes of actions are assumed to be known with certainty. 

Goals. The only types of objectives that can be specified are goals of attainment. It is not possible to specify a 
goal involving the maintenance of a condition or achievement of a condition by a deadline. In part, this is 
due to the fact that there is no explicit model of time. There is also no ability to specify a more general 
objective that involves optimization. In classical planning, optimization is generally assumed to be unim- 
portant - simply finding a plan is good enough. 

At one time or another, various researchers have extended the STRIPS and/or ADL representations to address some of 
these limitations. In particular, [131, 107, 125] have incorporated richer models of time into the STRIPS representation,
[107, 84, 82] allow various types of resources, and [110, 47, 109, 86, 41, 115, 67, 124, 135, 66, 65] introduce
forms of uncertainty into the representation. [131, 69, 138, 137] introduce more interesting types of goals, including 
maintenance goals and goals that involve deadlines. All of these extensions require extending the techniques for solv- 
ing classical planning problems, and these extensions typically have significant, detrimental impact on performance. 

2.1.2 Techniques

A number of different techniques have been devised for solving classical planning problems. It would be impossible 
to cover them in much detail here. Instead, we give a brief sketch of the most significant approaches, concentrating on 
the central ideas, advantages, and disadvantages of the approaches. 

Forward State Space Search 

The most obvious and straightforward approach to planning is forward state space search (FSS). The planner starts 
with the world state consisting of the initial conditions, chooses an operator whose preconditions are satisfied in that 
state, and constructs a new state by adding the effects of the operator and removing any proposition that is the negation 
of an effect. Search continues until a state is found where all the goals are attained. Figure 1 gives a sketch of this al- 
gorithm. 

FSS(current-state, plan)

1. If goals ⊆ current-state then return(plan)

2. Choose an action A (an instance of an operator) with preconditions satisfied in current-state

3. If no such actions exist, then fail.

4. Construct the new state s' by:

removing propositions in current-state inconsistent with effects of A
adding the effects of A into current-state

5. FSS(s', plan | A)

Figure 1: A non-deterministic forward state space search algorithm. The algorithm is initially called with the initial state
and an empty plan. The statement beginning with Choose is a backtrack point.

The trouble with forward state space search is that the number of permissible actions in any state is often very large,
resulting in an explosion in the size of the search space. In our spacecraft example, there are many instruments, switches,
and valves that can be activated at any given moment. In addition, there are an infinite number of different directions
that the spacecraft could turn towards. In order for FSS to be successful, strong heuristic guidance must be provided
so that only a small fraction of the search space is explored. For this reason, most planning systems have used a
much different search strategy. However, there are two recent and notable exceptions: in [8, 7], Bacchus uses formulas
in temporal logic to provide domain-specific guidance for a forward state space planning system, and Geffner [23]
automatically derives strong heuristic guidance by first searching a simplified version of the space. Both planning systems
have exhibited speed that is competitive with the most successful techniques described below, although the resulting
plans are often considerably longer than the plans generated by those other techniques.
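
The FSS procedure of Figure 1 can be sketched in Python by replacing the non-deterministic Choose with a systematic breadth-first exploration. This is only an illustrative sketch under several assumptions: propositions are plain strings, the set of directions is restricted to three made-up headings (the real problem has infinitely many), and forward_search and spacecraft_actions are hypothetical names.

```python
from collections import deque


def forward_search(initial, goals, actions):
    """Breadth-first realisation of the non-deterministic FSS procedure.

    actions is a list of (name, preconditions, add_effects, delete_effects);
    the last three are frozensets of ground propositions.
    Returns a list of action names, or None if no plan exists.
    """
    goals = frozenset(goals)
    start = frozenset(initial)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, plan = frontier.popleft()
        if goals <= state:                      # step 1: all goals attained
            return plan
        for name, pre, add, delete in actions:  # step 2: every applicable action
            if pre <= state:
                successor = (state - delete) | add   # step 4: build s'
                if successor not in visited:
                    visited.add(successor)
                    frontier.append((successor, plan + [name]))
    return None                                 # step 3: nothing left to try


def spacecraft_actions(directions=("Earth", "Star", "Asteroid")):
    # Hypothetical ground actions for a tiny version of the spacecraft domain.
    actions = []
    for src in directions:
        for dst in directions:
            if src != dst:
                actions.append((f"Turn({src}->{dst})",
                                frozenset({f"Pointing({src})"}),
                                frozenset({f"Pointing({dst})"}),
                                frozenset({f"Pointing({src})"})))
    actions.append(("Calibrate(Camera)",
                    frozenset({"On(Camera)", "Pointing(Star)"}),
                    frozenset({"Calibrated(Camera)"}),
                    frozenset({"On(Camera)"})))
    actions.append(("TakeImage(Asteroid)",
                    frozenset({"Calibrated(Camera)", "Pointing(Asteroid)"}),
                    frozenset({"Image(Asteroid)"}),
                    frozenset()))
    return actions
```

Even in this toy domain, the number of Turn actions grows quadratically with the number of headings, which hints at why unguided forward search scales poorly.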

Goal-directed Planning 

Until recently, most classical planning work has focussed on constructing plans by searching backwards from the 
goals. The basic idea is to choose an action that can accomplish one of the goals and add that action to the nascent plan. 
That goal is removed from the set and replaced by subgoals corresponding to the preconditions of the action. The whole 
process is then repeated until the subgoals that remain are a subset of the initial conditions. Like the forward state space 
approach, this approach is relatively simple as long as the nascent plan is kept totally ordered. However, if the actions 
are left only partially ordered, some additional bookkeeping is necessary to make sure that the unordered actions do 
not interfere with each other. This algorithm is sketched in Figure 2. 

Subgoal(goals, constraints, plan)

1. If constraints is inconsistent, fail

2. If goals ⊆ initial-conditions then return(plan)

3. Select a goal g ∈ goals

4. Choose an action A (an instance of an operator) with g as an effect

5. Choose:

Subgoal(goals-g+preconditions(A), new-constraints, plan+A)

If A is already in the plan, Subgoal(goals-g, new-constraints, plan)

Figure 2: A non-deterministic goal-directed planning algorithm. The algorithm is initially called with the goals
and an empty plan. Statements beginning with Choose are backtrack points. Note that in step 5, even if a suitable
action exists in the plan, adding another copy must still be considered as an alternative in case sharing produces
conflicts.
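
To make the backward direction concrete, here is a small Python sketch of the simple, totally ordered variant mentioned above. It is not the partial-order algorithm of Figure 2: no constraint set is kept, and actions are prepended to a sequence. It assumes positive goals and preconditions only, and the names regress, backward_search and spacecraft_actions are hypothetical.

```python
from collections import deque


def regress(goals, action):
    """Subgoals that must hold before `action` so that `goals` hold after it.

    Returns None when the action achieves none of the goals or deletes one.
    """
    _, pre, add, delete = action
    if not (add & goals) or (delete & goals):
        return None
    return (goals - add) | pre


def backward_search(initial, goals, actions):
    """Breadth-first search backwards from the goals over sets of subgoals."""
    initial = frozenset(initial)
    start = frozenset(goals)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        subgoals, plan = frontier.popleft()
        if subgoals <= initial:          # remaining subgoals hold initially
            return plan
        for action in actions:
            earlier = regress(subgoals, action)
            if earlier is not None and earlier not in seen:
                seen.add(earlier)
                # The action is placed in front of the plan built so far.
                frontier.append((earlier, [action[0]] + plan))
    return None


def spacecraft_actions(directions=("Earth", "Star", "Asteroid")):
    # Hypothetical ground actions: (name, preconditions, adds, deletes).
    actions = []
    for src in directions:
        for dst in directions:
            if src != dst:
                actions.append((f"Turn({src}->{dst})",
                                frozenset({f"Pointing({src})"}),
                                frozenset({f"Pointing({dst})"}),
                                frozenset({f"Pointing({src})"})))
    actions.append(("Calibrate(Camera)",
                    frozenset({"On(Camera)", "Pointing(Star)"}),
                    frozenset({"Calibrated(Camera)"}),
                    frozenset({"On(Camera)"})))
    actions.append(("TakeImage(Asteroid)",
                    frozenset({"Calibrated(Camera)", "Pointing(Asteroid)"}),
                    frozenset({"Image(Asteroid)"}),
                    frozenset()))
    return actions
```

Only actions that achieve some outstanding subgoal are considered at each step, which is the source of the lower branching factor usually attributed to goal-directed search.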

Various strategies have been developed for doing the bookkeeping mentioned above [127, 29, 90, 79]; the most widely
used being the causal-link approach popularized by McAllester [90]. Along with a plan, a set of causal links is main- 
tained indicating propositions that must be preserved in between certain actions in the plan. Thus, when an action is 
added to the plan to establish a particular subgoal, a causal link is also added to make sure that the subgoal is preserved 
between the establishing action and the action that needed the subgoal. In our spacecraft example, suppose that our 
goal is to have an image of a particular asteroid, Image(Asteroid), and the plan already contains the action,
TakeImage(Asteroid, Camera), for taking the image. We therefore have a subgoal, Pointing(Asteroid), of having the spacecraft
pointed in the direction of the asteroid. To accomplish this subgoal, the planner adds the action, Turn(Asteroid), of
turning the spacecraft to point at the asteroid. Along with this turn action, the planner would add the causal link to preserve
the proposition Pointing(Asteroid) in between the turn action and the action of taking the image.

Causal links must be checked periodically during the planning process to make sure that no other action can threaten
them. If a threat is found, additional ordering constraints are imposed among the actions in order to eliminate it. For
the spacecraft, a second turn action (to a different heading) would threaten the causal link for the first heading. This
turn action would, therefore, have to come either before the first turn action or after the TakeImage action.
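
The threat check and its two resolutions can be illustrated with a short Python sketch. It is a simplification made for illustration: steps are identified by name, a causal link is a (producer, proposition, consumer) triple, orderings are (before, after) pairs, and we assume the second turn deletes Pointing(Asteroid). The function names has_cycle, threatens and resolve are hypothetical.

```python
def has_cycle(orderings):
    """Detect a cycle in a set of (before, after) ordering constraints."""
    graph = {}
    for before, after in orderings:
        graph.setdefault(before, set()).add(after)
        graph.setdefault(after, set())
    status = dict.fromkeys(graph, "new")

    def visit(node):
        status[node] = "open"
        for nxt in graph[node]:
            if status[nxt] == "open" or (status[nxt] == "new" and visit(nxt)):
                return True
        status[node] = "done"
        return False

    return any(status[n] == "new" and visit(n) for n in graph)


def threatens(step, deletes, link, orderings):
    """True if `step` deletes the protected proposition and may still be
    ordered between the producer and the consumer of the link."""
    producer, prop, consumer = link
    if step in (producer, consumer) or prop not in deletes.get(step, ()):
        return False
    return not has_cycle(orderings | {(producer, step), (step, consumer)})


def resolve(step, link, orderings):
    """Return every consistent ordering set that removes the threat:
    the step goes before the producer, or after the consumer."""
    producer, _, consumer = link
    options = []
    for extra in ((step, producer), (consumer, step)):
        candidate = orderings | {extra}
        if not has_cycle(candidate):
            options.append(candidate)
    return options


# Spacecraft example: Turn(Asteroid) supplies Pointing(Asteroid) to TakeImage.
LINK = ("Turn(Asteroid)", "Pointing(Asteroid)", "TakeImage")
ORDERINGS = frozenset({("Turn(Asteroid)", "TakeImage")})
DELETES = {"Turn(Earth)": {"Pointing(Asteroid)"}}  # assumed binding of ?direction
```

With these definitions, resolve("Turn(Earth)", LINK, ORDERINGS) yields two alternative ordering sets, corresponding to placing the second turn before the first turn or after the TakeImage action.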

Planners that make use of this approach are often called partial order causal link (POCL) planners. Good introductions
to POCL planning can be found in [133, 118, 34].

Despite all the effort invested in goal-directed classical planning (particularly POCL planning), these planners have
not met with much practical success. Although the branching factor is typically lower when searching backwards from 
goals (rather than forwards from initial conditions), it is still big enough, and there is considerable bookkeeping in- 
volved. As a consequence, the success of goal-directed planning depends just as heavily on strong heuristic guidance 
as FSS planning. Interestingly enough, guidance often seems to be more difficult to express in the goal directed search 
paradigm. Without such guidance, such planners have generally been limited to solving problems that require on the 
order of a dozen actions. 

POCL planning has been extended beyond the classical planning paradigm. The most well known and widely distributed
POCL planner, UCPOP [106], handles operators with quantified conditional effects, along with other features of
the ADL language. POCL planners have also been constructed that can handle time [131, 107], metric quantities [107], 
and uncertainty [110, 47, 115, 67, 86, 41, 66, 65]. Unfortunately, these planners have generally been unable to solve 
problems involving more than a handful of actions. 

Graphplan 

In 1995, Blum and Furst introduced a planning system, called Graphplan [18, 19], that employs a very different ap- 
proach to searching for plans. The basic idea is to perform a kind of reachability analysis to rule out many of the com- 
binations and sequences of actions that are not compatible. Starting with the initial conditions, Graphplan figures out 
the set of propositions that are possible after one step, two steps, three steps, and so forth. For the first step, this set will 
contain the union of the propositions that hold in states reachable in one step from the initial conditions. For our space- 
craft example, after one step, an image could be taken in the starting direction, or the spacecraft could be pointed in a 
new direction. So, all of these propositions are in the reachable set after one step. After two steps, in addition to all of 
the propositions possible after one step, an image could be taken in any of the new directions, or a data link could be 
established with Earth (if the spacecraft were pointed at Earth in step 1). After three steps, data could be transmitted 
back to Earth. 

However, not all of these propositions are compatible; even if we permitted concurrent action, not all of the reachable 
propositions could be achieved at the same time. For the spacecraft, we cannot take an image in the original direction 
and turn simultaneously, because the action of taking the image requires that the spacecraft remain pointed in the orig- 
inal direction. Likewise, the spacecraft cannot turn towards the asteroid and turn to face Earth at the same time. Graph- 
plan captures this notion of incompatibility by inferring binary mutual exclusion (mutex) relationships between 
incompatible actions and between incompatible propositions. The rules for mutual exclusion are remarkably simple: 

• Two actions are mutex at a given step if any of the following holds:

- they have opposite effects

- an effect of one is opposite a precondition of the other

- they have mutex preconditions at that step

• Two propositions are mutex at a given step if either:

- they are opposite literals

- all actions that give rise to them are mutex at the previous step

Using these rules, Graphplan can infer the following:

1. After one step it is not possible to be oriented towards Earth and be oriented towards the asteroid.

2. After two steps it is not possible to have the asteroid image and be oriented towards Earth, nor is it possible to
have the asteroid image and have a communications link established.

3. After three steps it is still not possible to have both the asteroid image and a communications link
established.
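
The mutex rules are simple enough to state directly in code. The following Python sketch applies them to a single step of a tiny spacecraft example; it is an illustration under assumptions, not Graphplan itself. Literals are strings with a "~" prefix for negation, actions are (name, preconditions, effects) triples, persistence (no-op) actions are left out, and all function names are hypothetical.

```python
from itertools import combinations


def negate(literal):
    return literal[1:] if literal.startswith("~") else "~" + literal


def actions_mutex(a, b, prop_mutex):
    """Apply the three action rules; prop_mutex holds frozenset pairs of
    propositions that are mutex at the step where a and b are applied."""
    _, pre_a, eff_a = a
    _, pre_b, eff_b = b
    if any(negate(e) in eff_b for e in eff_a):            # opposite effects
        return True
    if any(negate(e) in pre_b for e in eff_a) or \
       any(negate(e) in pre_a for e in eff_b):            # effect vs precondition
        return True
    return any(frozenset((p, q)) in prop_mutex            # mutex preconditions
               for p in pre_a for q in pre_b)


def props_mutex(p, q, producers, action_mutex):
    """Apply the two proposition rules, given which actions produce what."""
    if p == negate(q):
        return True
    return all(frozenset((a, b)) in action_mutex
               for a in producers[p] for b in producers[q])


def expand_level(actions, prev_prop_mutex=frozenset()):
    """Return (action mutexes, proposition mutexes) for one step."""
    act_mutex = {frozenset((a[0], b[0]))
                 for a, b in combinations(actions, 2)
                 if actions_mutex(a, b, prev_prop_mutex)}
    producers = {}
    for name, _, effects in actions:
        for literal in effects:
            producers.setdefault(literal, []).append(name)
    prop_mutex = {frozenset((p, q))
                  for p, q in combinations(sorted(producers), 2)
                  if props_mutex(p, q, producers, act_mutex)}
    return act_mutex, prop_mutex


# Step-one actions, starting from a state where Pointing(Start) holds.
STEP_ONE = [
    ("Turn(Asteroid)", {"Pointing(Start)"}, {"~Pointing(Start)", "Pointing(Asteroid)"}),
    ("Turn(Earth)", {"Pointing(Start)"}, {"~Pointing(Start)", "Pointing(Earth)"}),
    ("TakeImage(Start)", {"Pointing(Start)"}, {"Image(Start)"}),
]
```

Running expand_level(STEP_ONE) marks the two turns as mutex, because each negates the other's precondition, and consequently marks Pointing(Asteroid) and Pointing(Earth) as mutex after one step, in line with the first inference above.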

Using this simple mutex information, the goal of having the asteroid image transferred back to Earth cannot be attained 
until after five steps. As a result, Graphplan would not bother to actually search for a plan until proposition level six of 
the plan graph had been constructed. The key features of the first two levels of the plan graph are illustrated in Figure 3.

Figure 3: Key features of the first two levels of the plan graph for a spacecraft example. Thick vertical arcs between pairs of
propositions and between pairs of actions indicate mutex relationships. Many additional actions, propositions, and mutual 
exclusion relationships have been omitted at each level for clarity. 

Graphplan has been shown to significantly outperform the previously discussed POCL planning systems on a wide 
range of problems. In the recent planning systems competition held at AIPS-98 [91], all but one of the competing plan- 
ning systems were based on Graphplan techniques. Roughly speaking, the best planners using this technology have 
been limited to problems with fewer than about 50 actions.

As with POCL planning, the Graphplan technique has been extended to handle operators with quantified conditional 
effects and other features of ADL [56, 85, 5]. Attempts have been made to extend Graphplan to allow reasoning under 
uncertainty [124, 135, 20], but these efforts have, so far, not proven to be very practical. There are more recent efforts 
to allow certain limited consideration of time [125] and metric quantities in Graphplan [84]. It remains to be seen how 
far these ideas will go and how well the techniques will scale. 

A more detailed introduction to Graphplan and extensions of Graphplan can be found in [134]. 

Planning as Satisfiability 

The basic idea behind planning as satisfiability is to guess a plan length, translate the planning problem into a
propositional formula, and try to solve the resulting satisfiability (SAT) problem. If the formula is unsatisfiable, the
length is increased, and the process is repeated. A number of different encoding schemes have been studied to date [81, 
43, 80], but the basic idea in most of these schemes is to have a propositional variable for 

• each possible action at each step 

• each possible proposition at each step 

Each action variable indicates the presence or absence of the action at that step in the plan. Each proposition variable 
indicates whether or not that proposition is true at that step in the plan. SAT clauses are generated for each of the fol- 
lowing constraints: 


Initial Conditions. The propositions in the initial conditions are true at step 0.

Goals. The goals are true at the last step.

Actions. Each action occurring at step k implies that all of its preconditions are true at step k, and all
of its effects are true at step k+1.

Causality. If a proposition is false (true) at step k and true (false) at step k+1, then at least one of the
actions that can cause the proposition to become true (false) must have occurred at step k.

Exclusion. Two incompatible actions cannot occur at the same step.
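
The five constraint types can be turned into clauses mechanically. The Python sketch below does this for a two-action fragment of the spacecraft problem and solves the result by brute-force enumeration, which is only feasible because the example is tiny. Several choices here are our assumptions rather than part of any published encoding: propositions absent from the initial conditions are taken to be false at step 0, two actions are treated as incompatible when one deletes a precondition or add effect of the other, and the names encode, solve and sat_plan are hypothetical.

```python
from itertools import combinations, product


def encode(props, initial, goals, actions, n):
    """Clauses for a plan of length n. A clause is a list of (variable, wanted
    truth value) pairs; actions are (name, pre, add, delete) tuples."""
    def prop(p, k):
        return ("prop", p, k)

    def act(a, k):
        return ("act", a, k)

    clauses = [[(prop(p, 0), p in initial)] for p in props]        # initial conditions
    clauses += [[(prop(g, n), True)] for g in goals]                # goals
    for k in range(n):
        for name, pre, add, delete in actions:                      # actions
            clauses += [[(act(name, k), False), (prop(p, k), True)] for p in pre]
            clauses += [[(act(name, k), False), (prop(p, k + 1), True)] for p in add]
            clauses += [[(act(name, k), False), (prop(p, k + 1), False)] for p in delete]
        for p in props:                                             # causality
            adders = [(act(a[0], k), True) for a in actions if p in a[2]]
            deleters = [(act(a[0], k), True) for a in actions if p in a[3]]
            clauses.append([(prop(p, k), True), (prop(p, k + 1), False)] + adders)
            clauses.append([(prop(p, k), False), (prop(p, k + 1), True)] + deleters)
        for a, b in combinations(actions, 2):                       # exclusion
            if a[3] & (b[1] | b[2]) or b[3] & (a[1] | a[2]):
                clauses.append([(act(a[0], k), False), (act(b[0], k), False)])
    return clauses


def solve(clauses):
    """Exhaustive search for a satisfying assignment (tiny problems only)."""
    variables = sorted({var for clause in clauses for var, _ in clause})
    for values in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, values))
        if all(any(model[var] == want for var, want in clause) for clause in clauses):
            return model
    return None


def sat_plan(props, initial, goals, actions, max_len=4):
    """Guess increasing plan lengths until the formula becomes satisfiable."""
    for n in range(max_len + 1):
        model = solve(encode(props, initial, goals, actions, n))
        if model is not None:
            return [[a[0] for a in actions if model[("act", a[0], k)]]
                    for k in range(n)]
    return None


PROPS = ["Pointing(Earth)", "Pointing(Asteroid)", "Image(Asteroid)"]
ACTIONS = [
    ("Turn(Asteroid)", frozenset({"Pointing(Earth)"}),
     frozenset({"Pointing(Asteroid)"}), frozenset({"Pointing(Earth)"})),
    ("TakeImage(Asteroid)", frozenset({"Pointing(Asteroid)"}),
     frozenset({"Image(Asteroid)"}), frozenset()),
]
```

For this fragment the formulas for lengths 0 and 1 are unsatisfiable, and the length-2 formula has a model in which the turn occurs at step 0 and the image is taken at step 1.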

A number of additional tricks can be employed to reduce the number of variables and clauses, particularly in cases
where actions have multiple arguments [43, 80, 134].

After translation is performed, fast simplification algorithms, such as unit propagation and pure literal elimination, are 
used to shrink the formulas. Systematic or stochastic methods can then be used to search for solutions. 
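
Both simplification steps are easy to state in code. The sketch below is a generic illustration, not taken from any particular SAT planner: clauses are lists of non-zero integers in the usual signed-literal convention (a negative number denotes a negated variable), and the function names unit_propagate and eliminate_pure_literals are hypothetical.

```python
def unit_propagate(clauses):
    """Repeatedly assign the literal of any one-literal clause.

    Returns (simplified clauses as sets, assignment), or (None, assignment)
    if an empty clause is derived, meaning the formula is unsatisfiable.
    """
    clauses = [set(c) for c in clauses]
    assignment = {}
    while True:
        unit = next((next(iter(c)) for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses, assignment
        assignment[abs(unit)] = unit > 0
        remaining = []
        for clause in clauses:
            if unit in clause:
                continue                      # clause is satisfied, drop it
            reduced = clause - {-unit}        # the opposite literal is now false
            if not reduced:
                return None, assignment       # conflict
            remaining.append(reduced)
        clauses = remaining


def eliminate_pure_literals(clauses, assignment):
    """Assign every literal whose negation never occurs and drop the
    clauses it satisfies."""
    literals = {lit for clause in clauses for lit in clause}
    pure = {lit for lit in literals if -lit not in literals}
    for lit in pure:
        assignment[abs(lit)] = lit > 0
    return [c for c in clauses if not (c & pure)], assignment
```

For example, on the clauses [[1], [-1, 2], [-2, 3, 4], [-3, 4]], unit propagation fixes variables 1 and 2, and pure literal elimination then fixes variable 4, leaving no clauses for the search phase.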

The best SAT planners and Graphplan-based planners have very similar performance. Both significantly outperform 
POCL planners on most problems. Like POCL and Graphplan planners, SAT planners can be extended to allow operators
with quantification and conditional effects. In fact, this extension only impacts the translation process, not the
solution process. Metric quantities like fuel present more serious difficulties. Wolfman [139] handles metric quantities
by using linear programming in concert with SAT planning techniques. We will discuss this further in Section 4.2. 

The most serious disadvantages of the SAT planning approach are: 

Encoding size. The number of variables and clauses can be very large because all possible actions and proposi- 
tions are represented explicitly for each discrete time point. As a result, SAT planners often require huge 
amounts of memory (gigabytes) for even modestly sized problems. 

Continuous Time. The encoding described above is limited to discrete time and, therefore, cannot deal with 
actions that have varying durations or involve temporal constraints. An alternative causal encoding [80] 
could be used instead, but so far this encoding has not proven very practical or efficient. 

A good introduction to SAT planning can be found in [134]. 

2.2 HTN Planning 

Virtually all planning systems that have been developed for practical applications make use of Hierarchical Task
Network (HTN) planning techniques [136, 128, 102]. The basic difference between HTN planning and classical planning
is that HTN planning is about reducing high-level tasks down to primitive tasks, while classical planning is about
assembling actions to attain goals. In HTN planning, the objective is usually specified as a high-level task to be
performed, rather than as a conjunction of literals to be attained. For example, the spacecraft objective of having an image
of a particular asteroid might be specified as the high-level task, ObtainImage(Asteroid). Planning proceeds by recursively
expanding high-level tasks into networks of lower level tasks that accomplish the high-level task. The allowed expansions
are described by transformation rules called methods. Basically, a method is a mapping from a task into a partially
ordered network of tasks, together with a set of constraints. In the spacecraft example, a possible method for instantiating
ObtainImage tasks is shown in Figure 4. According to this method, ObtainImage(?target, ?instrument) can be replaced


ObtainImage(?target, ?instrument)

Figure 4: Simple decomposition method for obtaining an image.

with a partially ordered network of three tasks, Turn(?target), Calibrate, and TakeImage(?target, ?instrument), with an additional
constraint that the proposition Pointing(?target) must be maintained in between turning and taking the image. After
each expansion, an HTN planner looks for conflicts among tasks in the network. Conflicts are resolved using critics
that typically impose additional ordering constraints and combine or eliminate overlapping actions. Planning is complete
when the resulting network contains only primitive tasks and the set of constraints is consistent. Figure 5 shows
a simple pseudo-code algorithm for HTN planning.

HTN-Plan(N)

1. If N contains conflicts,

2. If there is no way to resolve the conflicts then fail

3. Else choose a way of resolving the conflicts and apply it

4. If N contains only primitive tasks, return(N)

5. Select a non-primitive task t in N

6. Choose a method t → E for task t

7. N' ← Replace t with E in N

8. HTN-Plan(N')

Figure 5: The basic HTN decomposition procedure for a task network N. Choose indicates a backtrack 
point. 
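
The decomposition loop of Figure 5 can be sketched in Python under some strong simplifications, all of which are our assumptions: task networks are totally ordered lists rather than partial orders, critics are reduced to a single consistency test that rejects a network outright, and Choose is realised by depth-first backtracking over the available methods. The names htn_plan, obtain_with, METHODS, PRIMITIVE and no_spectrometer are hypothetical.

```python
def htn_plan(network, methods, primitive, consistent=lambda net: True):
    """Expand non-primitive tasks until only primitive tasks remain.

    network: list of tasks, each a tuple (task_name, arg1, ...).
    methods: dict mapping a task name to a list of functions, each returning
             a list of subtasks for the given arguments.
    Returns a list of primitive tasks, or None if every choice fails.
    """
    if not consistent(network):
        return None                                   # unresolvable conflict
    for index, task in enumerate(network):
        if task[0] not in primitive:                  # select a non-primitive task
            for method in methods.get(task[0], []):   # choose a method (backtrack point)
                expansion = method(*task[1:])
                result = htn_plan(network[:index] + expansion + network[index + 1:],
                                  methods, primitive, consistent)
                if result is not None:
                    return result
            return None                               # no method works for this task
    return network                                    # only primitive tasks remain


def obtain_with(instrument):
    # One linearisation of the method in Figure 4, specialised to an instrument.
    def method(target):
        return [("Calibrate", instrument),
                ("Turn", target),
                ("TakeImage", target, instrument)]
    return method


PRIMITIVE = {"Calibrate", "Turn", "TakeImage"}
METHODS = {"ObtainImage": [obtain_with("Spectrometer"), obtain_with("Camera")]}


def no_spectrometer(network):
    # Stand-in for a critic: pretend any use of the spectrometer is a conflict.
    return all("Spectrometer" not in task for task in network)
```

Calling htn_plan([("ObtainImage", "Asteroid")], METHODS, PRIMITIVE, no_spectrometer) first tries the spectrometer method, rejects it, and backtracks to the camera method. A task with no method at all simply fails, even if the primitive tasks could accomplish it.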

Time and metric quantities do not present as much difficulty for HTN planning systems. These constraints can be spec- 
ified within the methods and can be checked for consistency along with ordering and protection constraints. In fact, 
several existing HTN planning systems provide the capability for time and metric constraints [136, 128]. It is also rel- 
atively easy to combine an HTN planning system with an underlying scheduling system - once a task network has been 
reduced to primitive tasks, a scheduling system can be used to optimize the order of the resulting network. 

The greatest strength of HTN planning is that the search can be tightly controlled by careful design of the methods. In 
classical planning, the preconditions and effects of an action specify when the action could be used and what it could 
be used to achieve. In HTN planning, methods specify precisely what combinations of actions should be used for par- 
ticular purposes. In a sense, an HTN planner is told how to use actions, while a classical planner must figure this out 
from the action description. 

There are three principal criticisms that have been leveled against HTN planning:

Semantics. Historically, many HTN planning systems have not had clearly defined semantics for the decomposi- 
tion methods or for the system behavior. As a result, it has been difficult to judge or evaluate such things as 
consistency and completeness. Although some systems still suffer from a lack of rigor, there have been 
recent efforts by Erol [45, 46] and others [140, 78, 13] to provide a clear theoretical framework for HTN 
planning. 

Engineering. In general, it is difficult to develop a comprehensive set of methods for an application. One must 
anticipate all the different kinds of tasks that the system will be expected to address and all useful ways
that those tasks could be accomplished. Methods must then be developed to cover all of those possibilities.
If there are many different kinds of tasks, and/or many different ways in which tasks can be achieved, this
becomes a daunting engineering task. Changes to the domain can also be problematic; e.g. if new instru- 
ment capabilities are added to our spacecraft, a whole host of new methods may be required to use those 
capabilities, even if those capabilities overlap those provided by existing instruments. 

Brittleness. HTN planners are often seen as brittle, because they are unable to handle tasks that were not explic- 
itly anticipated by the designer, even if the available primitive actions are sufficient for constructing a suit- 
able plan. 

There is no denying that most practical planning systems have used HTN techniques [136, 128, 102], and anyone cur- 
rently considering a serious planning application would be well advised to consider this approach. Nevertheless, many 
researchers are dissatisfied with HTN planning because it is closer to “programming” a particular application, rather 
than providing a declarative description of the available actions and using general techniques to do the planning. 

There is no comprehensive overview article that describes different HTN planning systems and techniques, but a clear 
introduction to the basic mechanisms of HTN planning can be found in [45]. 
2.3 MDP Techniques 

For many years, researchers in Operations Research and Decision Sciences have modeled sequential decision problems 
using Markov Decision Processes (MDPs). In the last ten years, there has been growing interest within the AI community 
in using this technology for solving planning problems that involve uncertainty. 

Basically, an MDP is a state space in which transitions between states are probabilistic in nature. For example, suppose 
that the imager in the spacecraft has a sticky shutter that sometimes fails to open. Thus, when the spacecraft attempts 
to take an image in a particular direction, it may or may not get one. Part of the MDP for this scenario is illustrated in 
Figure 6. 



Figure 6: A fragment of the spacecraft MDP. 

MDPs are traditionally solved using powerful techniques called value-iteration and policy-iteration [116]. These techniques 
find optimal policies for an MDP, which amount to conditional plans that specify which action should be taken 
in every possible state of the MDP. 
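
As a rough illustration of value iteration, the following sketch solves a two-state version of the sticky-shutter example. The state names, the 0.8 probability that the shutter opens, the reward of 1 for obtaining an image, and the discount factor of 0.9 are all assumptions made for illustration; they are not taken from Figure 6.

```python
# Hypothetical two-state MDP for the sticky shutter; every number is an assumption.
# TRANSITIONS[state][action] = list of (probability, next_state, reward)
TRANSITIONS = {
    "no_image": {
        "take_image": [(0.8, "have_image", 1.0), (0.2, "no_image", 0.0)],
        "wait": [(1.0, "no_image", 0.0)],
    },
    "have_image": {
        "wait": [(1.0, "have_image", 0.0)],
    },
}


def action_value(outcomes, values, gamma):
    """Expected discounted return of one action, given current state values."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions, gamma=0.9, tol=1e-9):
    """Return (values, policy) for a finite MDP by repeated Bellman backups."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            best = max(action_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    # The policy names one action for every state: a conditional plan.
    policy = {
        s: max(actions, key=lambda a: action_value(actions[a], values, gamma))
        for s, actions in transitions.items()
    }
    return values, policy


values, policy = value_iteration(TRANSITIONS)
```

Under these assumed numbers, the resulting policy retries the image whenever none has been obtained. The table of values has one entry per state, which is why the size of the state space dominates the cost.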

The principal difficulty with using MDPs for planning has always been the size of the state space. If the spacecraft 
contains 50 switches, each of which can be either on or off, then there are 2^50 different possible states just for the 
switches alone. As a result, much of the AI work in this area has concentrated on limiting the size of the state space by: 

• using more compact representations that exploit the fact that, typically, many propositions and actions are 
independent or nearly so. 

• using approximation techniques that expand only the most probable and seemingly useful portions of the state 
space. 

These techniques have been successfully used to solve planning problems in certain carefully circumscribed domains. 
In particular, they have proven useful in robot navigation tasks, where there is uncertainty in the robot’s location and 
orientation after moving [33]. The size of the state space is still a significant obstacle in broader application of these 
techniques. To be fair, alternative approaches (extensions of POCL, Graphplan, and SAT techniques) for planning un- 
der uncertainty have not proven very practical either. 

Apart from the issue of state space size, there are some other significant limitations of the MDP framework: 

Complete Observability. The MDP framework assumes that, after performing an action with an uncertain out- 
come, the agent (our spacecraft) can observe the resulting state. This is not a viable assumption in a world 
where machines have limited sensors and sensing is costly. Such problems can be represented in Partially 
Observable Markov Decision Processes (POMDPs), but the representation and solution techniques have 
generally not proven tractable for anything beyond very tiny problems. 

Atomic Time. There is no explicit model of time in the representation. Actions are modeled as if they were dis- 
crete, instantaneous, and uninterruptable. Allowing concurrent action or exogenous events results in a fur- 
ther explosion in the size of the state space. 

Goals. It is somewhat difficult to express goal attainment problems in the MDP framework. In general, they must 
be modeled as infinite horizon problems or as a succession of longer and longer finite horizon problems. 
Another issue is that optimal policies are often large and difficult to understand. If humans must examine, understand, 
and carry out plans, then it is better to have a simpler, more compact plan that covers only the most critical or likely 
contingencies. Limited contingency planning has been explored within the classical planning framework [42], but it is 
not yet clear how to do it within the MDP framework. 

Since our spacecraft example did not include uncertainty, the primary strength of the MDP approach is not needed for 
this problem. In general, optimization is more easily handled in the MDP approach than in the classical planning or 
HTN approaches; however, in this case, the inability to model time is more serious. The order in which observations 
are performed influences the duration of the turns, which in turn influences the completion time for observations. It is 
not obvious how to do this in the MDP framework without dramatically increasing the size of the search space. 

Introductions to the use of MDP techniques in planning can be found in [21, 24]. Many of the issues and limitations 
listed above are discussed, along with recent work on overcoming those limitations. 

3 Scheduling 

Scheduling did not receive serious attention in AI until the early 1980s when Fox et al. began work on the ISIS con- 
straint-directed scheduling system [50]. Since that time, a growing number of AI researchers have been working in the 
area. The common conception of scheduling in AI is that it is a special case of planning in which the actions are already 
chosen, leaving only the problem of determining a feasible order. This is an unfortunate trivialization of scheduling. 
Two well known OR textbooks on the subject [11, 112] define scheduling as the problem of assigning limited resources 
to tasks over time to optimize one or more objectives. There are three important things to note about this definition. 

• Reasoning about time and resources is at the very core of scheduling problems. As we noted earlier, these is- 
sues have received only limited attention within the AI planning community. 

• Scheduling problems are almost always optimization problems. Often it is easy to find a legal schedule by just 
stretching out the tasks over a long period. Finding a good schedule is much tougher. 

• Scheduling problems also involve choices. Often this is not just confined to choices of task ordering but in- 
cludes choices about which resources to use for each given task. For a given task, several alternative resources 
may be available that have differing costs and/or durations. 

The types of choices in a scheduling problem can also extend beyond ordering and resource choices. Alternative pro- 
cesses may be available for some steps in a scheduling problem. For example, it might be possible to either drill or 
punch a hole, and these two possibilities require different machines and have different costs. Tasks can also have setup 
steps - optional steps that may need to be performed before the task. In our spacecraft example, instrument calibration 
could be regarded as a setup step, because we only need to do it once for any sequence of observations using the same 
instrument. 

Given that scheduling problems can involve choices among resources and perhaps even process alternatives, what dis- 
tinguishes scheduling from the general definition of planning we gave in the beginning of Section 2? The difference is 
a subtle one: scheduling problems only involve a small, fixed set of choices, while planning problems often involve 
cascading sets of choices that interact in complex ways. In a scheduling problem, the set of tasks is given, although 
some tasks may be optional and some may allow simple process alternatives. In a planning problem, it is usually un- 
known how many tasks or actions are even required to achieve the objective. 

There is a vast literature on scheduling in OR (see [11, 26, 112, 113] for introductions). What distinguishes most AI 
work on scheduling from the OR work is that AI work tends to focus on general representations and techniques that 
cover a range of different types of scheduling problems. In contrast, OR work is often focused on developing optimized 
techniques for specific classes of scheduling problems like flow-shop, job-shop, and sports team scheduling problems. 
As an example, consider a recently published text on scheduling [26]. In this text, scheduling problems are taxono- 
mized according to roughly 10 features; each feature having between 2 and 10 values. Specific solution approaches are 
then presented for many of these problem classes, often based on specialized representations and algorithms. In addi- 
tion, some of the algorithms and the corresponding computational complexity results depend directly on problem de- 
tails, such as the number of available resources. 

The purpose of this section is not to examine the multitude of different methods for solving particular classes of sched- 
uling problems. Instead, we concentrate on approaches to general scheduling problems and discuss how these general 
approaches differ from both AI planning techniques and the more traditional OR scheduling techniques. To better 
ground the overview of general AI scheduling techniques, let us start out by focusing on a fairly general class of scheduling 
problems known as resource-constrained project scheduling (RCPS) problems. Later, we show how the presented 
approaches can be extended to more complex choices, such as resources with different cost/duration functions and set- 
up steps. An instance of an RCPS problem consists of: 

• A finite set of tasks, each of a certain duration. 

• A finite set of resources, each having a specified capacity. 

• A specification of how much of each resource each task requires (may be none at all). 

• A set of ordering constraints on the tasks. 

3.1 Representation 

In AI, the most common approach to solving a scheduling problem is to represent it as a constraint satisfaction problem 
(CSP) and to use general constraint satisfaction techniques [16, 32, 126] (see footnote 3). A constraint satisfaction problem formally 
specifies a set of decisions to be made and a set of constraints that limit which combinations of decisions are valid. The 
decisions are described in terms of variables, each of which can be assigned a value from its domain of values. The 
constraints are described in terms of relations that specify which combinations of value assignments are valid for the 
participating variables. 

Two main approaches have been explored in modeling scheduling problems as constraint satisfaction problems. The 
distinguishing factor is which decisions are being made when constructing a schedule. The possibilities are: 

• Assign start times to each task so that all given time and resource constraints are satisfied. 

• Impose ordering constraints among tasks so that all the given time and resource constraints are satisfied. 

In this section, we first give a brief overview of these two core approaches and their strengths and weaknesses. We then 
go on to discuss how these two approaches map into different constraint representation techniques, including constraint 
optimization problems. 

3.1.1 Selecting Start Times 

A very natural way of representing RCPS problems as constraint satisfaction problems is to: 

• Define a variable that represents the start time of each task. The variable’s domain consists of all possible start- 
ing times in a discretized interval defining the scheduling horizon. 

• Specify constraints enforcing the given task orderings; for example, if task A must come before task B, then 
the start of B must be no earlier than the start of A plus the duration of A. 

• For each timepoint and each resource, specify the constraint that the total usage of that resource by all tasks active at 
that point does not exceed its capacity. 

Decisions then take the form of assigning individual start times to tasks. 
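
The start-time representation can be sketched as follows. This is a minimal illustration, assuming integer timesteps starting at 0; the class name `RCPS`, its field names, and the function `violated_constraints` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class RCPS:
    """Hypothetical container for an RCPS instance with discretized time."""
    durations: dict      # task -> duration in timesteps
    capacity: dict       # resource -> capacity
    usage: dict          # (task, resource) -> amount required (missing means 0)
    precedences: list = field(default_factory=list)  # (a, b): a finishes before b starts


def violated_constraints(problem, start):
    """List the constraints broken by a complete assignment of start times."""
    violations = []
    # Ordering constraints: start(b) >= start(a) + duration(a).
    for a, b in problem.precedences:
        if start[b] < start[a] + problem.durations[a]:
            violations.append(("precedence", a, b))
    # Capacity constraints: one per timepoint and resource.
    horizon = max(start[t] + problem.durations[t] for t in start)
    for t in range(horizon):
        active = [k for k in start if start[k] <= t < start[k] + problem.durations[k]]
        for r, cap in problem.capacity.items():
            load = sum(problem.usage.get((k, r), 0) for k in active)
            if load > cap:
                violations.append(("capacity", r, t))
    return violations
```

Note that the number of capacity constraints grows with the horizon, which is one source of the limitations discussed next.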

Much of the initial work on scheduling as constraint satisfaction was done using this approach and, for certain appli- 
cations, it continues to be the favored representation. For complex resource problems, such as when capacity and usage 
change over time, this approach makes it possible to accurately determine the remaining resource availability for each 
point in time. However, there are also limitations that arise from fixing the exact time for each task and having the set 
of choices depend on the number of timesteps: 

Problem Size. The set of possible choices is unnecessarily large, since the real number of choices is much 
smaller than the set of all possible assignments of tasks to time points. This makes it difficult to search for 
solutions, both due to the sheer size of the search space and due to the number of operations needed to 
change the solution candidate significantly. 


3. The development of constraint representation and reasoning techniques has resulted in two closely related formalisms - con- 
straint satisfaction and constraint logic programming. Although there are some differences between the two, the core scheduling 
approaches and techniques are essentially the same. Therefore, we will present AI scheduling techniques in terms of the core con- 
cept of constraint satisfaction problems and disregard this distinction. 


December 22, 1999 





Discretized Time. The approach depends on discretized time, making it necessary to define atomic timesteps 
before the problem can be solved. To make matters worse, the size of the representation depends on how 
time is discretized - using hours for measuring time results in a very different representation from using 
seconds. 

3.1.2 Ordering Tasks 

The other common representation is based on the idea that two tasks that are ordered will not compete for the same 
resource. Defining ordering variables for pairs of tasks, we get the following constraint representation: 

• A Boolean variable for each ordered pair of tasks, representing that the first comes before the second. (Note 
that this gives rise to two ordering variables for each pair of tasks. If both of these variables are assigned “false” 
then the two tasks may overlap; assigning “true” to both variables is disallowed.) 

• Constraints among ordering variables which both encode pre-existing ordering constraints and also enforce the 
proper order of tasks based on assignments to these variables. 

• Constraints for determining the possible start times for each task, based on the ordering. 

• Constraints enforcing that, if each task starts as early as possible, then no resource will be overused. 

This representation permits the use of constraint propagation (described in Section 3.2.1) to keep track of possible start 
times, since the only decisions made are those that order (or fail to order) operations. Keeping track of the possible start 
times for each task serves two purposes. First, any ordering that violates a given time bound can be identified when the 
set of possible start times for a task becomes empty. Second, keeping track of this derived information makes it possible 
to effectively determine whether resources are in danger of being over-subscribed or not and, thus, guarantee that the 
final schedule is valid. Keeping track of the possible start times is relatively straightforward. In fact, the most common- 
ly used approaches are based on the same principles as Ford’s algorithm [49] and other techniques from OR that de- 
termine the set of possible execution times for each task given a partial ordering of the tasks. 

The task-ordering approach goes a long way towards addressing the limitations outlined above. For almost any realistic 
scheduling problem, the resulting search space is significantly smaller. There are more decisions to be made (n^2 rather 
than n, where n is the number of tasks), but the number of options for each decision is much smaller (2 rather than T, 
where T is the number of timesteps). The task-ordering representation is also independent of time discretization, as 
there are algorithms that can keep track of the range of start times without using discretized time points (see for exam- 
ple [37]). This is because mapping a partial ordering into a set of possible start times is a question of arithmetic calcu- 
lations, not a question of selecting specific times. Consequently, minimal schedule length can be determined directly 
from a given partial ordering, without searching through alternative start time assignments. Finally, this representation 
can be very useful in situations where durations are uncertain, as a partial ordering provides much more flexibility than 
a fixed time-stamped schedule. 
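
To illustrate why no time discretization is needed, the sketch below computes earliest start times and the minimal schedule length directly from a partial ordering. It is a standard longest-path forward pass over the precedence graph, offered as an assumption about how such a calculation might look rather than as the specific algorithms of [49] or [37]; the function names are hypothetical.

```python
from graphlib import TopologicalSorter


def earliest_starts(durations, precedences):
    """Earliest start of each task, given (a, b) pairs meaning 'a before b'."""
    preds = {t: set() for t in durations}
    for a, b in precedences:
        preds[b].add(a)
    est = {}
    # Visit tasks so that every predecessor is handled before its successors.
    for t in TopologicalSorter(preds).static_order():
        est[t] = max((est[p] + durations[p] for p in preds[t]), default=0)
    return est


def minimal_schedule_length(durations, precedences):
    """Length of the shortest schedule consistent with the partial ordering."""
    est = earliest_starts(durations, precedences)
    return max(est[t] + durations[t] for t in durations)


# Durations may be arbitrary real numbers; no atomic timestep is defined.
durations = {"A": 2.5, "B": 1.0, "C": 4.0}
ordering = [("A", "B"), ("A", "C"), ("B", "C")]
est = earliest_starts(durations, ordering)
length = minimal_schedule_length(durations, ordering)
```

Only additions and comparisons are involved, so the cost depends on the number of tasks and orderings, not on the granularity of time.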

When using an ordering-based encoding, ordering decisions need only be made until the system can guarantee that no 
resources are over-subscribed. However, as resource demands become more complex, verifying that resources are not 
over-subscribed becomes more challenging. Thus, despite its advantages, the task-ordering approach still has at least 
one significant weakness: 

Complex Resources. In cases where the resource requirements depend on when a task is scheduled, it is difficult 
to determine correctly whether or not the resource capacity is exceeded if only a bound on the task execu- 
tion time is known. 

Some progress has been made towards addressing this weakness for certain aspects of ordering-based scheduling. Ex- 
amples include an approach to handling changing resource capacities in an ordering-based framework [31] and tech- 
niques for effectively evaluating resource contention without the use of absolute time assignments [15]. The technique 
presented in [12] includes necessary conditions for the existence of a solution to an RCPS problem which can be used 
to prune the search space. The same paper also shows that these conditions can be used to adjust time-bounds and pre- 
sents a way to “lazily” impose ordering constraints that mimic the order-based encoding described above. 

3.1.3 Scheduling as Satisfiability 

In recent years, many AI researchers have studied satisfiability problems as an alternative to general constraint satis- 
faction problems. A satisfiability problem is a constraint satisfaction problem where each variable is a Boolean variable 
and each constraint is in the form of a disjunctive clause of literals, each of which represents a variable or the negation 
of a variable. The work in this area has resulted in the development of various fast and effective satisfiability algorithms 
that take advantage of the uniform structure of such problems (for examples of recent developments see [14, 89, 81]). 
To take advantage of this progress, researchers have looked at solving scheduling problems as satisfiability problems. 
Just as for scheduling as general constraint satisfaction, the satisfiability approach requires scheduling problems to be 
translated into satisfiability problems. This process is significantly more involved since temporal information must be 
represented with Boolean variables, and all constraints must be specified as clauses. The details of mapping job-shop 
scheduling into satisfiability are given in a comparison study that applied a number of different satisfiability algorithms 
to the resulting problems [32]. Some of these algorithms are discussed in Section 3.2. 

It should, however, be noted that applications of satisfiability translations are limited; the representation of time is 
discrete, and it is difficult to effectively represent arithmetic relations and functions. Furthermore, there are indications 
that the expense of representing the temporal constraints as clauses negates any advantages provided by the faster 
algorithms [73]. 

3.1.4 Scheduling as Constraint Optimization 

Although scheduling problems are often optimization problems, the core constraint approaches we have described 
above do not provide a mechanism for representing and utilizing evaluation functions. However, this turns out to be a 
relatively easy problem to fix, as a constraint satisfaction problem is easily extended to a constraint optimization problem. 
All that is required is a function that maps a solution to a real-valued cost or score [108]. Constraint optimization 
problems have not been studied as extensively as regular CSPs, but the studies have, nonetheless, provided 
a number of useful techniques for solving COPs. Many of these methods are adaptations of constraint satisfaction 
methods, while others make use of earlier optimization methods like simulated annealing. Later, when we turn our at- 
tention to methods for solving constraint scheduling problems, we will return to the issue of optimizing schedules with- 
in the constraint reasoning framework. 

3.1.5 Extensions 

Various extensions to the RCPS problem have been explored in the AI scheduling community. The simplest extension 
is where there is a choice between resources, such as in multiple-machine job-shop scheduling. This can effectively be 
represented by a variable for each task-resource pair that indicates the choice of which resource the task will utilize. 
Within the constraint representation, it is relatively straightforward to make duration and other aspects of the task de- 
pend on the resource chosen and to make the resource demand depend on when the task is performed. Additionally, 
constraint-based approaches can handle other types of extensions, such as setup steps that may or may not need to oc- 
cur before a task. Examples of such setup steps include the need to calibrate a camera that has not been used for a cer- 
tain amount of time and the need to warm up an engine that has long been idle. Setup steps can be represented as 
optional tasks with zero duration/cost under certain circumstances and non-zero duration/cost under others. 

3.1.6 Limitations 

Although the constraint framework is very general and can represent complex scheduling problems, there is a limit to 
how far the paradigm can be extended. Consider all of the choices for our ambitious spacecraft; choosing to do a par- 
ticular experiment leads to a choice of which instrument to use, which leads to a choice of which calibration target to 
use. Furthermore, the order in which experiments are performed influences all of these choices. Representing this problem 
as a CSP would require every conceivable task, choice, and decision point to be represented explicitly 
in the constraint network. Not only will this result in a large problem with hundreds, if not thousands, of 
variables and constraints, but each constraint will be quite complex. In effect, when there is a choice of tasks to per- 
form, the constraints involving those tasks must be made conditional on the presence or absence of the task. For exam- 
ple, consider the constraint that an experiment must fit within a particular time window; this constraint would seem to 
be representable with two inequalities. Taking into account the possible duration of turning, the possibility of having 
to calibrate the instrument and the choice of possible calibration targets, this simple pair of inequalities has been turned 
into a complex, conditional arithmetic constraint. 

In sum, as more choice is introduced into a scheduling problem, the number of variables becomes large and the con- 
straints become complex and cumbersome. As a result, this approach often becomes impractical for problems (like the 
spacecraft problem) that have many interacting choices. 
3.2 Solving Constraint Satisfaction Problems 

There are a wide variety of algorithms for solving CSPs, but most fall into one of two categories. Constructive search 
strategies attempt to build a solution by incrementally making commitments to variables, checking the constraints, and 
backtracking when violations are found. Local search strategies begin with a complete assignment of values to the variables 
and attempt to "repair" violated constraints by changing the value of a variable in one of those constraints. In the 
following sections, we discuss these paradigms in more detail and compare them. 

3.2.1 Constructive Search 

Figure 7 shows a simple constructive search algorithm for solving CSPs. 

ConstructiveSearch(CSP) 

1. If any constraint is violated then fail 

2. Else, if all variables are assigned values then return(CSP) 

3. Select an unassigned variable v 

4. Choose a value vi for v 

5. Let CSP' = propagate(CSP ∪ v = vi) 

6. ConstructiveSearch(CSP') 

Figure 7: A simple constructive search algorithm for CSPs using chronological backtracking. 
Choose indicates a backtrack point. 

At each level of the search, an unassigned variable is selected, and the procedure tries each of the different possible 
values for that variable, calling itself recursively on the resulting instantiated version of the original CSP. If an 
assignment proves inconsistent, the procedure 
fails, and backtracking occurs. There are several components of this procedure: variable ordering, value ordering, 
propagation strategy, and backtracking strategy. The choices made for these components have a dramatic impact on performance. 
There has been extensive research on these topics leading to significant improvements in the performance 
of CSP solvers. Because CSP techniques have served as the basis for most AI approaches to scheduling, we summarize 
the most significant developments next. 
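
One possible rendering of the procedure in Figure 7 is sketched below. The data layout (domains as sets, constraints as a scope plus a predicate) and the use of forward checking as the propagation step are our own assumptions for illustration; the nondeterministic Choose is realized as a loop that tries each value in turn.

```python
def propagate(domains, constraints, assignment):
    """Forward checking: prune domains of constraints with one free variable.

    Returns the reduced domains, or None if a constraint is violated or a
    domain becomes empty.
    """
    new_domains = {
        v: ({assignment[v]} if v in assignment else set(d)) for v, d in domains.items()
    }
    for scope, pred in constraints:
        free = [v for v in scope if v not in assignment]
        if len(free) > 1:
            continue
        if not free:
            if not pred(*(assignment[v] for v in scope)):
                return None
            continue
        var = free[0]
        keep = {
            x for x in new_domains[var]
            if pred(*(x if v == var else assignment[v] for v in scope))
        }
        if not keep:
            return None
        new_domains[var] = keep
    return new_domains


def constructive_search(domains, constraints, assignment=None):
    """Chronological backtracking; returns a full assignment or None."""
    assignment = dict(assignment or {})
    if len(assignment) == len(domains):                       # Step 2
        return assignment
    var = next(v for v in domains if v not in assignment)     # Step 3
    for value in sorted(domains[var]):                        # Step 4 (Choose)
        assignment[var] = value
        pruned = propagate(domains, constraints, assignment)  # Steps 1 and 5
        if pruned is not None:
            result = constructive_search(pruned, constraints, assignment)  # Step 6
            if result is not None:
                return result
        del assignment[var]                                   # backtrack
    return None


doms = {"x": {1, 2, 3}, "y": {1, 2, 3}, "z": {1, 2, 3}}
cons = [(("x", "y"), lambda a, b: a < b), (("y", "z"), lambda a, b: a < b)]
solution = constructive_search(doms, cons)
```

Each of the four components discussed next corresponds to one replaceable piece of this sketch: the `next(...)` expression, the `sorted(...)` loop, the `propagate` function, and the point at which the loop resumes after a failure.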

Variable Ordering 

Variable ordering concerns the selection of which variable to work on next (Step 3). There is considerable debate about 
what makes for a good variable ordering strategy. The best known general heuristic is the Minimum Remaining Values 
(MRV) heuristic [9, 59, 123, 55, 36, 70] (see footnote 4), which chooses the variable with the fewest remaining values. Basically, this 
strategy tries to keep the branching factor of the search space small for as long as possible. Propagation often aids this 
process by continually reducing the domains of the remaining variables [17]. In practice, MRV is inexpensive to com- 
pute and works very well on many CSP problems, at least compared to most other purely syntactic strategies [36, 55, 
59]. 
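
The MRV rule itself is only a few lines. The sketch below assumes domains are stored as a mapping from variable to a set of remaining values; the function name is hypothetical, and ties are broken arbitrarily by iteration order.

```python
def select_mrv_variable(domains, assignment):
    """Return the unassigned variable with the fewest remaining values."""
    unassigned = [v for v in domains if v not in assignment]
    return min(unassigned, key=lambda v: len(domains[v]))
```

Such a function could stand in for Step 3 of Figure 7. Its cost is a single pass over the variables, which is why the heuristic is considered inexpensive.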

Unfortunately, for scheduling problems where the variables are ordering decisions, MRV does not help much. The reason 
is that there are n^2 ordering variables, all of which have two values. In this case, more powerful heuristics are needed 
to select ordering variables for tasks that share "bottleneck" resources. The slack-based heuristics developed by 
Smith and Cheng [126] choose ordering variables according to the width of task start windows and the overlap between 
windows. Sadeh [121] pioneered more elaborate resource profile analysis to determine which tasks are probabilistically 
most likely to contend for bottleneck resources. The strategy is to select ordering variables for those tasks. Although 
these heuristics are expensive to compute, they appear to give much better advice for scheduling problems (see footnote 5). 


4. Other authors have called this heuristic dynamic variable ordering [55], dynamic search rearrangement, and fail first [59, 123]. 

5. One way to interpret these heuristics is that they perform MRV on the "dependent" variables of the ordering variables, i.e. the 
domain of values remaining for the start times of jobs. 
Value Ordering 

Value ordering concerns the choice of a value for a selected variable (Step 4). If the perfect choice is made at each level 
of the search (and the problem has a solution), then a solution will be found without any backtracking. However, even 
if the value ordering strategy is not perfect, good value choices reduce the amount of backtracking required to find a 
solution. A common strategy for value ordering is to choose a value that causes the least restriction of the domains of 
the remaining uninstantiated variables, thereby increasing the chances that a solution will be possible. For scheduling 
problems, one approach is to choose the ordering for two tasks that leaves the greatest time flexibility for the tasks 
[126]. Another approach is to use analysis of resource profiles to choose orderings that provide the most reduction in 
the demand of critical resources, thereby reducing the chances of a resource conflict [121]. 

For CSP problems where solutions are hard to find, value ordering seems to be less critical than variable ordering. For 
such problems, much of the search time is spent investigating dead ends (i.e., subproblems that turn out to have no 
solution). Unfortunately, value ordering does not help if a problem (or subproblem) is unsolvable, because all values 
must be investigated to discover that fact. Order simply does not matter. Thus, for a value ordering strategy to be useful, 
it must be accurate enough to avoid a significant fraction of the dead ends. In contrast, for problems where many solu- 
tions are possible, value ordering is more important, because even a decent strategy can lead to a solution very quickly. 
In the case of optimization, a good value ordering strategy can lead to better solutions early, which in turn makes 
branch-and-bound techniques more effective. 

Propagation Strategies 

Propagation in CSP solvers (Step 5) is the mechanism for drawing conclusions from new variable assignments. Propagation 
strategies range from simple, quick consistency checking strategies, like forward checking, to powerful but 
costly k-consistency strategies that can infer new k-ary constraints among the remaining variables [93]. In between 
these two extremes lie several useful strategies that prune the domains of uninstantiated variables. The best known of 
these strategies is arc-consistency [93], which examines individual constraints to eliminate any values of the partici- 
pating variables that will not satisfy the constraint, given the domains of the other participating variables. For example, 
suppose that both the variables x and y have domains {1, 2, 3}, and suppose we have the constraint that x < y. By 
arc-consistency, the value 3 can be eliminated for x and the value 1 can be eliminated for y. In solving a CSP, choosing and 
instantiating a variable can cause a cascade of propagation using arc-consistency. The advantage of this is that con- 
structive search does not need to consider these eliminated variable assignments, thereby reducing search cost [119]. 
Arc-consistency has been extended in various ways to work with n-ary constraints and continuous variables. PERT 
chart analysis is a special case of the latter extension. 
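
The x < y example can be reproduced with a small sketch of arc-consistency for binary constraints. This is a naive fixpoint loop written for clarity, not one of the optimized algorithms from the literature; the function names and the (x, y, predicate) constraint format are assumptions.

```python
def revise(domains, x, y, pred):
    """Drop values of x that have no supporting value of y under pred(x_val, y_val)."""
    supported = {a for a in domains[x] if any(pred(a, b) for b in domains[y])}
    changed = supported != domains[x]
    domains[x] = supported
    return changed


def make_arc_consistent(domains, constraints):
    """constraints: list of (x, y, pred). Revise both directions until stable."""
    domains = {v: set(d) for v, d in domains.items()}
    changed = True
    while changed:
        changed = False
        for x, y, pred in constraints:
            changed |= revise(domains, x, y, pred)
            # Same constraint viewed from y's side: swap the argument order.
            changed |= revise(domains, y, x, lambda b, a, p=pred: p(a, b))
    return domains


reduced = make_arc_consistent(
    {"x": {1, 2, 3}, "y": {1, 2, 3}},
    [("x", "y", lambda a, b: a < b)],
)
```

Here `reduced` leaves {1, 2} for x and {2, 3} for y, matching the eliminations described above.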

Another form of propagation commonly used in scheduling applications is called edge-finding [27, 103] (see also [16] 
for a more general discussion). Edge-finding is similar to arc-consistency in that it reduces the legal domains for variables. 
However, instead of dealing with a single constraint, edge-finding deals with a group of constraints that tie a set 
of tasks to a common resource. Figure 8 shows an example of four tasks, A, B, C, and D, that each require exclusive 
use of a common resource. Task A can start at any time within the interval [0, 4], while tasks B, C, and D can start 
within the interval [1, 3]. If we consider any two (or three) of the tasks, none of the start times can be eliminated. However, 
if we consider the four tasks together, it is possible to conclude that task A must start no earlier than at time 4; if 
task A is scheduled before that, there is not enough room for the remaining tasks in the window [1, 4]. Edge-finding 
can significantly reduce the time required to solve scheduling problems [103] (see footnote 6). 

Figure 8: Example of a task window that can be narrowed by edge-finding. 

For scheduling problems, as well as general CSPs, the simple, fast propagation strategies have generally proven to be
the most useful. Forward checking, arc-consistency, edge-finding, and variations of these strategies are the most commonly
used. More complex consistency techniques have generally proven too costly in either time or storage requirements.

Backtracking Strategies 

The procedure in Figure 7 performs chronological backtracking. When a variable assignment fails, it backs up one level
and tries another assignment for the preceding variable. If the possibilities for that variable are exhausted, it backs
up another level and so on. Many constructive search algorithms for CSPs use more sophisticated backtracking strategies
that identify the variables responsible for the failure and backtrack directly to the appropriate level. These strategies
include backmarking [93], backjumping, conflict-directed backjumping (CBJ) [100, 114], and dynamic
backtracking [62, 14]. Of these, CBJ has generally proven to be the best compromise between cost and power.

It is worth noting that different backtracking techniques can also be used for constraint optimization problems. All of
the techniques above can be combined with branch-and-bound style algorithms [10], which prune the search when the minimum
cost of any satisfying solution below the current node exceeds the cost of the best solution found so far. The addition
of branch-and-bound pruning provides more opportunity for pruning and permits propagation techniques and heuristics
to take advantage of the optimization criteria, as well as the constraints, when doing their work.
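
To make the pruning rule concrete, the following is a minimal sketch of chronological backtracking combined with branch-and-bound. The toy problem (three tasks, three slots, a weighted slot cost) and all names are assumptions made for illustration; the lower bound used here is simply the cheapest value of every unassigned variable.

```python
import math

def branch_and_bound(variables, domains, cost, consistent):
    """Depth-first search that remembers the best complete assignment found."""
    best = {"cost": math.inf, "assignment": None}
    # Optimistic estimate: the cheapest value for each variable, ignoring constraints.
    cheapest = {v: min(cost(v, val) for val in domains[v]) for v in variables}

    def search(index, assignment, cost_so_far):
        if index == len(variables):
            if cost_so_far < best["cost"]:
                best["cost"], best["assignment"] = cost_so_far, dict(assignment)
            return
        lower_bound = cost_so_far + sum(cheapest[v] for v in variables[index:])
        if lower_bound >= best["cost"]:
            return                          # no completion can beat the incumbent
        var = variables[index]
        for val in domains[var]:
            assignment[var] = val
            if consistent(assignment):
                search(index + 1, assignment, cost_so_far + cost(var, val))
            del assignment[var]

    search(0, {}, 0)
    return best["assignment"], best["cost"]

# Hypothetical example: each task gets a distinct slot; later slots cost more.
weights = {"a": 3, "b": 2, "c": 1}
variables = ["a", "b", "c"]
domains = {v: [1, 2, 3] for v in variables}
all_different = lambda asg: len(set(asg.values())) == len(asg)
best_assignment, best_cost = branch_and_bound(
    variables, domains, lambda v, slot: weights[v] * slot, all_different)
print(best_assignment, best_cost)
```

A tighter lower bound would prune more of the tree; the bound above is deliberately simple.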

Not all CSP backtracking techniques are based on backtracking to a recent conflict; an extreme, but often effective,
alternative is to backtrack all the way to the first decision. The simplest such techniques, called iterative or stochastic
sampling methods [88, 25], are based on starting over whenever a conflict is found. Slightly more sophisticated techniques
[J2 68] allow for limited chronological backtracking before the search process is restarted. Limited discrepancy
search (LDS) is an interesting approach that combines elements of traditional backtracking and sampling. LDS
uses a basic backtracking scheme to systematically sample the complete set of possible solutions, in order of increasing
number of discrepancies from the given heuristics. The intuition behind LDS is that the value heuristics are correct
most of the time, so only a small number of deviations need to be made to find a solution. As a result, it is often more
effective to search in order of increased number of discrepancies, rather than simply follow the structure of the search
space.
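
The idea behind LDS can be sketched as follows. This is a simplified illustration under stated assumptions: choosing any value other than the heuristic's first choice counts as one discrepancy, and each iteration explores all paths with at most the current number of discrepancies (so paths from earlier iterations are revisited). The example constraints and names are hypothetical.

```python
def lds(variables, ordered_values, consistent):
    """Return (solution, discrepancy budget used), or (None, None) on failure."""

    def probe(assignment, index, budget):
        if index == len(variables):
            return assignment
        var = variables[index]
        for rank, value in enumerate(ordered_values(var)):
            spend = 0 if rank == 0 else 1   # deviating from the heuristic costs one
            if spend > budget:
                break
            candidate = {**assignment, var: value}
            if consistent(candidate):
                found = probe(candidate, index + 1, budget - spend)
                if found is not None:
                    return found
        return None

    # Search in order of increasing number of allowed discrepancies.
    for budget in range(len(variables) + 1):
        found = probe({}, 0, budget)
        if found is not None:
            return found, budget
    return None, None

def consistent(asg):
    """Hypothetical constraints: a != b, b != c, c == d (checked once both are set)."""
    checks = [("a", "b", lambda p, q: p != q),
              ("b", "c", lambda p, q: p != q),
              ("c", "d", lambda p, q: p == q)]
    return all(test(asg[u], asg[v]) for u, v, test in checks
               if u in asg and v in asg)

# The value heuristic always prefers 0; it is wrong only for variable b.
solution, used = lds(["a", "b", "c", "d"], lambda var: [0, 1], consistent)
print(solution, used)
```

Because the heuristic is right for three of the four variables, a solution is found as soon as a single discrepancy is allowed.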

3.2.2 Local Search 

Local search techniques provide a very different way of solving CSPs than constructive search techniques. Local 
search typically begins with a full assignment of values to variables, even though some of the constraints may be vio- 
lated. The basic idea is to then gradually modify the assignment by changing the value of some or all of the variables 
in an attempt to move closer to a valid solution. This is done by repeatedly generating a “neighborhood” of new as- 
signments, ranking those assignments, and choosing the best one. Each of these actions is called a move. Since the 
effectiveness of this approach is highly dependent on the initial assignment, a new initial assignment is often generated
after a specified number of moves have failed to find a solution. Each of these cycles, consisting of generating an initial 
assignment and performing a number of moves, is called a try. After enough tries, without a solution being found, the 
process terminates with a failure indication. This process is shown in Figure 9. 

The sketch in Figure 9 describes a family of local search algorithms. To instantiate an algorithm, one must specify the 
procedures used to generate neighbors, rank the neighbors, and select the neighbor to move to next. The performance
of local search depends heavily on these procedures.

A neighbor is an assignment that is “nearby” the existing assignment; typically, this means an assignment that differs
in the value of exactly one variable (although other variations have been explored). Neighborhood generation procedures
differ in terms of how many and which neighbors they generate. One approach [95, 30] is to select a single variable
that is in a violated constraint and then generate the set of assignments that only differ in the value for the selected
variable. This approach generates a relatively small set of neighbors. A more thorough, but more expensive, approach
is to generate neighbors for every variable that occurs in any violated constraint [30].

LocalSearch(CSP)

1. Repeat for tries
2.    Generate an initial assignment, A, for the variables in CSP
3.    Repeat for moves
4.       If no constraints are violated, return(A)
5.       Select a neighboring assignment, A’
6.       Set A ← A’

Figure 9: Local Search algorithm sketch.

6. Edge-finding appears to be closely related to Freuder's notion of Neighborhood Inverse Consistency. Values are
eliminated for a variable by trying them in the reduced CSP consisting of all constraints
connected to the variable, the variables in those constraints, and the constraints between those variables.

Once the assignments have been generated, they are ranked. Typically, the ranking is based on the number of satisfied
constraints [96, 30, 95], but the constraints can also be weighted according to how important they are to satisfy [108]. 
After the assignments are ranked, one of the neighbors is selected as the new current assignment. A common strategy 
is to select the highest ranking neighbor, but probabilistic selection strategies have also been employed [30]. Other 
forms of selection permit any improving state to be selected (see for example [58, 96]). 
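
One possible instantiation of the sketch in Figure 9 is shown below for the n-queens problem. It is an illustrative assumption rather than any cited system: a variable from a violated constraint is picked at random, the neighbors are the assignments that differ only in that variable, neighbors are ranked by the number of violated constraints, and ties among the best are broken at random.

```python
import random

def attacks(a, r1, r2):
    """True if the queens in rows r1 and r2 violate a constraint."""
    return a[r1] == a[r2] or abs(a[r1] - a[r2]) == abs(r1 - r2)

def conflicts(a):
    """Number of violated constraints (attacking pairs of queens)."""
    n = len(a)
    return sum(attacks(a, r1, r2) for r1 in range(n) for r2 in range(r1 + 1, n))

def conflicted_rows(a):
    n = len(a)
    return [r1 for r1 in range(n)
            if any(r1 != r2 and attacks(a, r1, r2) for r2 in range(n))]

def local_search(n, max_tries, max_moves, rng):
    for _ in range(max_tries):                        # 1. repeat for tries
        a = [rng.randrange(n) for _ in range(n)]      # 2. initial assignment
        for _ in range(max_moves):                    # 3. repeat for moves
            if conflicts(a) == 0:                     # 4. no violations: done
                return a
            row = rng.choice(conflicted_rows(a))      # variable in a violated constraint
            neighbors = [a[:row] + [col] + a[row + 1:]
                         for col in range(n) if col != a[row]]
            best = min(conflicts(nb) for nb in neighbors)          # rank neighbors
            a = rng.choice([nb for nb in neighbors
                            if conflicts(nb) == best])             # 5-6. move
    return None                                       # failure after all tries

solution = local_search(8, max_tries=50, max_moves=200, rng=random.Random(0))
print(solution)
```

Swapping in a different neighborhood generator, ranking function, or selection rule yields the other members of the family described above.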

Local search algorithms may suffer from problems with plateaus, local minima, and cycling. Plateaus are large regions
of the search space in which no assignment improves the value of the objective function. In these regions, local search 
algorithms have little guidance and often wander aimlessly. Local minima are regions where all neighboring assign- 
ments are worse than the current assignment. Greedy selection heuristics can lead to cycling behavior in local minima, 
because a neighboring state is selected in one move and the local minimum is selected again during the next move (see
[52] for a discussion of these issues for satisfiability problems). 

The AI community has developed some techniques to address these problems. One method used to escape local min- 
ima and plateaus is random noise [120]; rather than always choosing the best neighboring assignment, a random neigh- 
bor is occasionally chosen instead. This approach has led to significant improvement for a number of local search 
methods. Random moves force local search away from local minima, cycles, and plateaus, permitting exploration of 
other parts of the search space, similar to the popular simulated annealing approach [83]. Another technique for escap- 
ing local minima, plateaus and cycles is to change the ranking of neighbors during search. This can be accomplished 
by either learning new constraints [28] or by increasing the penalty for violating certain constraints [51, 122]. The latter
techniques essentially modify the ranking function in order to make the current assignment less attractive, leading the 
search algorithm to explore other parts of the search space. Tabu search [64] has a similar objective; it directly breaks 
cycling by preventing the return to recently explored options.

Empirical studies have shown that local search methods often solve problems faster than constructive search tech- 
niques. For this reason, these methods have been used for solving a number of large practical problems. There are, how- 
ever, two significant drawbacks to the use of local search: 

Completeness. With constructive search, if there is a solution, it will eventually be found. There is no such guarantee
for local search. In addition, a local search procedure can only fail to find a solution; it can never conclude
that there is no solution. This means that local search methods always take a long time before failing,
even on very small unsolvable problems.

Optimization. It is difficult to do optimization within a local search framework. One cannot simply combine the 
optimization criterion with a ranking function for choosing neighbors. If the optimization function domi- 
nates, the search process is prevented from finding valid solutions. On the other hand, if violated con- 
straints are made paramount, then the optimization function is limited to exploring solutions around the 
local minimum found by the search. One solution to this problem is to limit the local search neighborhood 
to only feasible solutions as in [6], but these are often difficult to characterize and the algorithm may be 
unable to find good solutions. Finally, it should be noted that repeated invocation of local search can gener- 
ate a set of solutions, from which the best can be chosen. However, this is often inefficient, as there is no 
apparent analog to branch-and-bound in local search. 
For many scheduling problems, including our spacecraft problem, completeness is not a serious issue, as these problems
are generally rich in valid solutions. However, as we noted above, the goal is not to simply find a solution, but to
find a good solution. As a result, the difficulties associated with combining neighborhood ranking and optimization are 
a significant problem in using local search for scheduling. 

4 Bridging the Gap 

As we noted in the introduction, the ambitious spacecraft problem exhibits characteristics of both planning problems 
and scheduling problems. Like scheduling problems, it involves time constraints, actions with differing durations, re- 
sources, metric quantities, and optimization. However, it also involves action choice - choosing to do a particular ob- 
servation leads to a choice of instrument to use, which leads to a choice of which calibration target to use. The choices 
for different observations also interact in complex ways. Since performing an observation leaves the spacecraft pointed 
in a new direction, this influences the choice of subsequent observations, instruments, and calibration targets. This kind 
of complex and cascading action choice is the principal issue that has been addressed in most planning research. 

In the preceding sections, we have discussed both AI planning and scheduling techniques with respect to the spacecraft 
problem. Most classical planning techniques are unable to represent or reason about resources, metric quantities, or 
continuous time. Many techniques also ignore optimization. As we hinted in Section 2.1.2, there have been attempts 
to extend classical planning techniques to treat resources [40, 87], metric quantities [107, 84, 82, 139], and to allow 
optimization criteria [82, 139, 130]. There have also been attempts to extend planning techniques to deal with continuous
time and time constraints [131, 4, 98, 107, 60, 74, 125]. In this section, we revisit the issues of resources, metric
quantities, and continuous time, and we examine some of the recent attempts to extend planning techniques into these 
areas. 


4.1 Resources 

In the AI community, the term resource has been used to refer to a number of different things, from discrete sharable 
facilities (like a group of identical machines in a factory) to continuous quantities that are consumable and renewable 
(like fuel). Here we will use the term resource in the former sense: to refer to something like a machine, instrument, 
facility, or personnel that is used exclusively during an action, but is otherwise unaffected by the action. In our space- 
craft example, the attitude control system, the cameras, and many other instruments would be resources. Resources 
have received relatively little attention in the planning community, perhaps because many early planning formalisms 
did not allow concurrent actions. 

In scheduling, resources are often classified as single-capacity or multiple-capacity. Single-capacity resources present 
no special difficulty for most planning formalisms. If two actions both require the same resource, the actions conflict 
and cannot be allowed to overlap. In the POCL framework, this can be enforced by adding ordering constraints be- 
tween any such actions that can possibly overlap. 7 In Graphplan, any actions with resource conflicts can be made mu- 
tually exclusive in the plan graph. This is enough to prevent them from being scheduled concurrently. In SAT planning, 
mutual exclusion axioms can be added for every pair of actions with resource conflicts at each time point. These mech- 
anisms are all straightforward, but the presence of resource conflicts may substantially increase the difficulty of finding 
solutions to a problem. 

Multiple-capacity resources present much more difficulty, for two reasons: 1) it is more difficult to identify groups of 
actions with potential resource conflicts and 2) the number of potential ways of resolving the conflict grows exponen- 
tially with the number of actions involved. These are the same issues faced when dealing with multiple-capacity re- 
sources in scheduling problems. As a result, the techniques used in planning bear a close resemblance to techniques 
that have been developed in scheduling. The O-Plan planner uses optimistic and pessimistic resource profiles [40] to
detect and resolve potential resource conflicts. The IxTeT planner [87] uses graph search algorithms to identify mini- 
mum critical sets of actions with conflicting resources. A disjunction of ordering constraints is then added to resolve 
the conflict. 
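
The notion of a minimal critical set, and the disjunction of ordering constraints that resolves it, can be illustrated with a brute-force sketch. This is not the graph search used by IxTeT; it simply enumerates subsets, and it assumes that all listed actions could potentially overlap. The demands and the capacity are hypothetical.

```python
from itertools import combinations

def minimal_critical_sets(demand, capacity):
    """demand maps an action name to the units of the resource it needs.

    Returns the minimal groups of actions whose combined demand exceeds the
    capacity, assuming every action may overlap every other action.
    """
    critical = []
    names = sorted(demand)
    for size in range(1, len(names) + 1):       # smaller sets are found first
        for group in combinations(names, size):
            if sum(demand[a] for a in group) <= capacity:
                continue
            # Minimal only if it contains no critical set found earlier.
            if not any(set(found) <= set(group) for found in critical):
                critical.append(group)
    return critical

def resolvers(group):
    """Each pair (a, b) stands for the ordering constraint 'a before b'.

    Adding any one of these constraints breaks up the critical set.
    """
    return [(a, b) for a in group for b in group if a != b]

demand = {"A": 2, "B": 2, "C": 1, "D": 1}
critical = minimal_critical_sets(demand, capacity=3)
print(critical)
print(resolvers(critical[0]))
```

The number of resolvers grows with the size of each critical set, which is one source of the exponential growth mentioned above.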


7. This is quite similar to the mechanism of promotion and demotion for handling threats. 
4.2 Metric Quantities

A number of the planning techniques discussed in Section 2.1 have been extended to allow real-valued or metric quantities.
For the spacecraft example, the direction that the spacecraft is pointing and the fuel remaining are metric quantities.
The difficulty with metric quantities is that when performing an action, the change in one quantity is often a
mathematical function of the change in another. For example, in a turn operation, the amount of fuel consumed is a 
function of the angular distance between the current and target orientations of the spacecraft. One very simple approach 
is to augment the preconditions and effects of STRIPS operators with equality and inequality constraints involving 
arithmetic and functional expressions. Borrowing notation from Koehler [84], we might describe the turn operation as:

Turn (?target):
    Preconditions: Pointing(?direction), ?direction ≠ ?target,
                   Fuel ≥ Angle(?direction, ?target)/ConsumptionRate
    Effects: ¬Pointing(?direction), Pointing(?target),
             Fuel -= Angle(?direction, ?target)/ConsumptionRate

The inequality in the precondition specifies that the available fuel must be sufficient to turn the spacecraft through the 
desired angle. The equality in the effects specifies that the fuel will be reduced by that same amount at the conclusion 
of the turn. Note that, in this representation, we have not specified how the fuel changes during the turn, only what it 
must be before and after the turn. For many purposes this representation is sufficient. However, if two actions are al- 
lowed to use or affect the same metric quantity simultaneously, it becomes necessary to describe how the quantity 
changes over the course of the action. Most planners that deal with metric quantities adopt a simple discrete represen- 
tation like that above [136, 84, 139] and do not allow concurrent actions to impact the same metric quantity. A few 
planners, like Penberthy’s Zeno planner [105], use a more detailed representation of how metric quantities change over 
time, but still place restrictions on concurrent actions affecting the same metric quantity. 
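
The discrete before-and-after reading of the Turn operator can be sketched as a state-transition function. The State class, the angle table, and the numeric values below are illustrative assumptions; only the precondition and effect structure follows the operator given above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    pointing: str
    fuel: float

def turn(state, target, angle, consumption_rate):
    """Apply Turn(?target); return the successor state, or None if inapplicable."""
    direction = state.pointing
    if direction == target:                     # precondition: ?direction ≠ ?target
        return None
    required = angle(direction, target) / consumption_rate
    if state.fuel < required:                   # precondition: Fuel ≥ Angle/ConsumptionRate
        return None
    # Effects: the old Pointing fact is replaced and Fuel is decremented.
    return State(pointing=target, fuel=state.fuel - required)

# Hypothetical angular distances between targets, in degrees.
ANGLES = {frozenset(("Earth", "A37")): 40.0}

def angle(a, b):
    return ANGLES[frozenset((a, b))]

start = State(pointing="Earth", fuel=5.0)
after = turn(start, "A37", angle, consumption_rate=20.0)
print(after)
```

Nothing in this representation says how the fuel level evolves during the turn, which is exactly the limitation noted in the text.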

The presence of metric quantities and constraints can cause great difficulty in planning. In general, the metric con- 
straints on actions can be nonlinear or can involve derivatives. The simplest approach to dealing with metric quantities 
in planning is to ignore metric constraints until the values of the variables are known precisely. At that time the constraint
can be checked to see whether or not it is satisfied. If not, backtracking occurs. For a turn action, the current
and target orientations, the fuel, and fuel consumption rate would all have to be known before the inequality precon- 
dition could be checked. This passive approach has been used in a few planning systems [131, 39], but the approach is 
quite weak, because it only detects difficulties late in the planning process and provides no guidance in actually choos- 
ing appropriate actions. 

4.2.1 LP techniques 

Several planners have been constructed that attempt to check the consistency of sets of metric constraints even before 
all variables are known. In general, this can involve arbitrarily difficult mathematical reasoning, so these systems typ- 
ically limit their analysis to the subset of metric constraints consisting of linear equalities and inequalities. Several re- 
cent planners have used LP techniques to manage these constraints [128, 107, 139]. 

One such planner is Zeno [107, 105], a POCL planner that continually checks metric constraints using a combination
of Gaussian elimination for equalities and an incremental Simplex algorithm for inequalities. Zeno deals with nonlinear
constraints by waiting until enough variables are determined that they become linear. To see how Zeno works, consider
the subgoal of pointing at a particular asteroid A37. Zeno would add a turn operation to the plan, would generate
the subgoal Pointing(?direction), and would post the constraint Fuel ≥ Angle(?direction, A37)/ConsumptionRate. Since Fuel and
?direction are not yet known, Zeno cannot yet do anything with this constraint. Instead, Zeno would work on the outstanding
subgoal Pointing(?direction). If the spacecraft is initially pointing at Earth, the subgoal could be accomplished
by requiring that ?direction = Earth. Now the inequality constraint reduces to the simple linear inequality Fuel ≥ number,
which Zeno would check using its incremental Simplex algorithm. If the required fuel exceeds a maximum constraint,
Fuel ≤ MaxFuel, this inconsistency would be detected by Simplex. If no violation occurs, Fuel ≥ number would be introduced
as a subgoal. While Zeno exhibits an impressive collection of technical features and techniques, the planner can
only handle very small problems. Basically, Zeno suffers from the same kinds of search control difficulties that plague
less ambitious POCL planners. However, these problems are compounded by the fact that Zeno models continuous
time and continuous change, which often means that it has a much larger set of possible action choices to investigate.

A more recent effort to deal with metric quantities is the LPSAT planner [139]. LPSAT uses LP techniques in conjunction
with SAT planning techniques. The planning problem is encoded as a SAT problem as described in Section 2.1.2,
except that metric preconditions and effects are replaced by Boolean trigger variables in the SAT encoding. During
planning, if a trigger variable becomes true, the corresponding metric constraint is passed to an incremental Simplex
solver. If the Simplex solver reports that the collection of metric constraints is inconsistent, the SAT solver backtracks
in order to change one or more of the trigger variables. The performance of LPSAT is quite promising, but it shares the
same disadvantages as the underlying SAT approach:


Encoding size. The number of variables and constraints can be very large because all possible actions and prop- 
ositions are represented explicitly for each discrete time point. 

Continuous Time. The usual encoding is limited to discrete time and, therefore, cannot deal with actions that
have varying durations or involve temporal constraints. 


In addition, metric quantities raise the possibility that there may be an infinite number of possible actions. For example,
consider a refueling action, where the quantity is a continuous function of the duration. This corresponds to an infinite 
number of possible ground actions which simply cannot be handled in the SAT framework. 

4.2.2 ILP Planning 


Another approach to handling metric quantities is to represent a planning problem as a mixed integer linear program
(ILP). Two papers in this special issue [82, 130] discuss techniques for translating planning problems into ILP problems.
For the most part, the encodings follow the same form as the encodings of planning problems as satisfiability
problems. Variables are defined for:

• each possible action at each discrete time step 

• each possible proposition at each discrete time step 

Instead of true/false values, the variables take on 0/1 values, with 1 indicating that the proposition is true or the action 
takes place. In this formulation, the constraints between actions and propositions take the form of linear inequalities 
that can be constructed directly from the clauses used in the SAT planning formalism. For example, an action A having 
an effect E would lead to SAT clauses of the form:

    ¬A_t ∨ E_{t+1}

for each possible time instant t. These would be translated into the following linear inequalities:

    (1 − A_t) + E_{t+1} ≥ 1

As with SAT planning, similar inequalities would be required for action preconditions, for explanatory frame axioms,
and for action mutual exclusion. Actions involving metric constraints can be translated in a manner similar to that described
for LPSAT. A standard ILP solver can then be used to solve the set of inequality constraints.
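
The clause-to-inequality translation is mechanical: a positive literal contributes its 0/1 variable, a negated literal contributes one minus its variable, and the sum must be at least 1. The sketch below, whose function names and clause representation are assumptions made for illustration, performs the translation and checks by enumeration that the clause and the inequality agree on every 0/1 assignment.

```python
from itertools import product

def clause_to_inequality(clause):
    """clause is a list of (variable, positive) literals.

    Returns (coeffs, bound), read as: sum(coeffs[v] * v) >= bound.
    """
    coeffs, bound = {}, 1
    for var, positive in clause:
        if positive:
            coeffs[var] = coeffs.get(var, 0) + 1
        else:
            # A negated literal contributes (1 - var); move the 1 to the right side.
            coeffs[var] = coeffs.get(var, 0) - 1
            bound -= 1
    return coeffs, bound

def clause_holds(clause, values):
    return any(values[var] == (1 if positive else 0) for var, positive in clause)

def inequality_holds(coeffs, bound, values):
    return sum(c * values[var] for var, c in coeffs.items()) >= bound

# The clause (not A_t) or E_{t+1}, for an action A with effect E.
clause = [("A_t", False), ("E_t+1", True)]
coeffs, bound = clause_to_inequality(clause)
agree = all(
    clause_holds(clause, dict(zip(coeffs, bits)))
    == inequality_holds(coeffs, bound, dict(zip(coeffs, bits)))
    for bits in product((0, 1), repeat=len(coeffs))
)
print(coeffs, bound, agree)
```

The result, −A_t + E_{t+1} ≥ 0, is the inequality (1 − A_t) + E_{t+1} ≥ 1 with the constant moved to the right-hand side.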

The principal advantages of the ILP approach are:

Uniform mechanism. All axioms, including metric constraints, translate into equalities and inequalities. A stan- 
dard ILP solver can then be used to solve the set of equations. 

Optimization. It becomes relatively easy to specify optimization criteria, and the ILP solver can do the optimiza- 
tion naturally. 

Dixon and Ginsberg [38] point out that many types of exclusionary constraints can be modeled much more compactly 
using inequalities, although it is not yet clear whether this will have a significant impact for planning problems. 

Of course, the ILP approach shares two significant disadvantages with the basic SAT planning approaches discussed earlier:

Encoding size. The number of variables and constraints can be very large because all possible actions and propositions
are represented explicitly for each discrete time point. As mentioned earlier, various tricks can be
used [43] to reduce the size of the encoding, but the size can still be unmanageable for even modestly sized
problems.

Time. The encoding described above is limited to discrete time and, therefore, cannot deal with actions that have 
varying durations or that involve temporal constraints. A causal encoding [80] could conceivably be used 
but this has not yet been attempted. 
So far, ILP methods are not yet competitive with SAT planning methods, because the LP relaxation step is somewhat
costly and seems to provide very limited pruning power. However, this area has only begun to receive serious attention. 

4.3 Time 

The STRIPS representation uses a discrete model of time, in which all actions are assumed to be instantaneous (or at 
least of identical duration). However, as we mentioned earlier, the POCL approach to planning does not depend on this 
assumption. In POCL planning, actions can be of arbitrary duration, as long as the conditions under which actions interfere
are well defined. Vere was the first to exploit this in the development of the Devisor system [131]. 8 Devisor
allows actions of arbitrary duration, and it also allows both goals and actions to be restricted to particular time windows.
For a POCL planner, this complicates both the detection and resolution of threats, because the representation and main- 
tenance of the time constraints between actions is more complex. 

Devisor is only the first of several systems [48, 107, 60, 98, 74] that have combined ideas from POCL planning with a 
more general representation of time and action. While we will not discuss the details and differences for all of these 
systems, we will describe some common ideas and a common framework that seem to be emerging from this work. In 
particular, many of these systems use an interval representation for actions and propositions, and they rely on con- 
straint-satisfaction techniques to represent and manage the relationships between intervals. We will refer to this as the 
Constraint-Based Interval (CBI) approach.

4.3.1 The Interval Representation 

The idea of using an interval representation of time was first introduced and popularized in AI by James Allen [3]. Rather
than describing the world by specifying what facts are true in discrete time slices or states, Allen describes the world
by asserting that propositions hold over intervals of time. Similarly, actions and events are described as taking place 
over intervals. Constraints between intervals describe the relationships between actions (or events) and the propositions 
that must hold before and after. In our spacecraft example, we could say that the spacecraft is turning towards asteroid 
A37 over a particular interval I as Turning(A37)_I. Similarly, we would say that the spacecraft is pointing at A37 over a particular
interval J as Pointing(A37)_J. Following Joslin [76, 77], we will use the term temporally qualified assertion (TQA)
to refer to a proposition, action, or event taking place over an interval. 

Allen introduced a set of seven basic interval relations (and their inverses) that can be used to describe the relationships 
between intervals. These are summarized pictorially in Figure 10. Using these interval relationships we can describe
how actions (or events) affect the world. To see how this can be done, recall the simple set of STRIPS operators intro- 
duced in Section 2.1 for turning, calibrating, and taking images. 

Turn (?target):
    Preconditions: Pointing(?direction), ?direction ≠ ?target
    Effects: ¬Pointing(?direction), Pointing(?target)

Calibrate (?instrument):
    Preconditions: Status(?instrument, On), CalibrationTarget(?target), Pointing(?target)
    Effects: ¬Status(?instrument, On), Status(?instrument, Calibrated)

TakeImage (?target, ?instrument):
    Preconditions: Status(?instrument, Calibrated), Pointing(?target)
    Effects: Image(?target)

Using the interval representation, the intervals and constraints implied by the TakeImage operator would be:

TakeImage(?target, ?instrument)_A → ∃P {Status(?instrument, Calibrated)_P ∧ Contains(P, A)}
                                  ∧ ∃Q {Pointing(?target)_Q ∧ Contains(Q, A)}
                                  ∧ ∃R {Image(?target)_R ∧ Meets(A, R)}

This axiom states that if there is an action TakeImage(?target, ?instrument) over the interval A, then Status(?instrument, Calibrated)
and Pointing(?target) must hold for intervals P and Q containing the TakeImage action, and Image(?target) will hold
over some interval R immediately following the action. This is depicted graphically in Figure 11.
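
As a rough illustration, the axiom can be checked against a set of temporally qualified propositions with fixed endpoints. The tuple representation, the concrete times, and the loose reading of Contains (endpoints may coincide) are all assumptions made for this sketch; a planner would normally keep the endpoints as constrained variables rather than numbers.

```python
def contains(outer, inner):
    """Loose containment: inner lies within outer (endpoints may coincide)."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def meets(first, second):
    """first ends exactly where second begins."""
    return first[1] == second[0]

def take_image_supported(target, instrument, a, tqas):
    """Check the TakeImage axiom for an action over interval a.

    tqas is a list of (proposition, (start, end)) pairs.
    """
    def exists(proposition, relation):
        return any(p == proposition and relation(i) for p, i in tqas)

    return (
        exists(("Status", instrument, "Calibrated"), lambda p: contains(p, a))
        and exists(("Pointing", target), lambda q: contains(q, a))
        and exists(("Image", target), lambda r: meets(a, r))
    )

plan = [
    (("Status", "Camera1", "Calibrated"), (5, 20)),
    (("Pointing", "A37"), (8, 15)),
    (("Image", "A37"), (12, 30)),
]
supported = take_image_supported("A37", "Camera1", (10, 12), plan)
unsupported = take_image_supported("A37", "Camera1", (10, 12), plan[:2])
print(supported, unsupported)
```

The second call fails because no Image(A37) interval is met by the action.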


8. Technically, Devisor is based on Nonlin [127], which was a precursor to the modern notion of POCL planning.
A before B
A meets B
A overlaps B
A starts B
A contains B
A = B
A ends B

Figure 10: Graphical depiction of Allen’s basic interval relationships.
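
For intervals with known endpoints, the relation that holds can be computed directly from the endpoint comparisons. The following sketch assumes intervals are (start, end) pairs with start < end; the relation names follow Figure 10, and the function name is an illustrative choice.

```python
def allen_relation(a, b):
    """Return the basic relation between intervals a and b (each with start < end)."""
    (a0, a1), (b0, b1) = a, b
    if a1 < b0:
        return "before"
    if a1 == b0:
        return "meets"
    if a0 == b0 and a1 == b1:
        return "equals"
    if a0 == b0 and a1 < b1:
        return "starts"
    if a1 == b1 and a0 > b0:
        return "ends"
    if a0 < b0 and a1 > b1:
        return "contains"
    if a0 < b0 < a1 < b1:
        return "overlaps"
    # Otherwise one of the seven relations holds with the roles swapped.
    return "inverse of " + allen_relation(b, a)

examples = [((0, 2), (3, 5)), ((0, 3), (3, 5)), ((0, 4), (3, 5)),
            ((3, 4), (3, 5)), ((0, 6), (3, 5)), ((3, 5), (3, 5)),
            ((4, 5), (3, 5)), ((4, 4.5), (3, 5))]
for a, b in examples:
    print(a, b, allen_relation(a, b))
```

The last example falls through to the inverse case: the first interval lies strictly inside the second.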


[Diagram: Pointing(?target) and Status(?instr, Calibrated) each contain TakeImage(?target, ?instr), which meets Image(?target).]

Figure 11: Graphical depiction of the interval constraints for TakeImage.

The interval representation permits much more flexibility and precision in specifying temporal relationships than is 
possible with simple STRIPS operators. For example, we can specify that a precondition only need hold for the first 
part of an action, or that some temporary condition will hold during the action itself. We can also specify that two ac- 
tions must be performed simultaneously in order to achieve some condition, or that there are particular time constraints 
on goals, actions, or events. 

Although quantified logical expressions like that given above are very general, they are also cumbersome and somewhat
difficult to understand. Muscettola [98, 74] has developed a shorthand for specifying such axioms that makes interval
existence implicit. For example, the TakeImage axiom above can be specified as:

TakeImage (?target, ?instrument)  contained-by  Status(?instrument, Calibrated)
                                  contained-by  Pointing(?target)
                                  meets         Image(?target)

This is interpreted to mean that if an interval exists in which a TakeImage action occurs, then other intervals exist in
which Status(?instrument, Calibrated), Pointing(?target), and Image(?target) hold, and these intervals have the indicated relationships
with the interval for TakeImage. 9 Using this notation, the constraints for the Turn and Calibrate operators would
be:


Turn (?target)           met-by        Pointing(?direction)
                         meets         Pointing(?target)

Calibrate (?instrument)  met-by        Status(?instrument, On)
                         contained-by  CalibrationTarget(?target)
                         contained-by  Pointing(?target)
                         meets         Status(?instrument, Calibrated)

4.3.2 The CBI Planning Algorithm


With the interval representation, planning is accomplished in a way very similar to POCL planning (discussed in Section
2.1.2). The planner works backwards from the goals, adding new action TQAs to the plan, which in turn introduce
new subgoal TQAs by virtue of the interval constraints. Throughout this process, additional ordering constraints may 
be needed between TQAs in order to eliminate conflicts. If there are no outstanding subgoals and no conflicts remain, 
the plan is complete. Backtracking occurs if there is no way to achieve a particular subgoal or if there is no way to 
satisfy the constraints governing the intervals in a partial plan. A sketch of the algorithm is shown in Figure 12. 

Expand(TQAs, constraints)

1. If the constraints are inconsistent, fail
2. If all TQAs have causal explanations, return(TQAs, constraints)
3. Select a g ∈ TQAs with no causal explanation
4. Choose:
      Choose another p ∈ TQAs such that g can be coalesced with p under constraints C
         Expand(TQAs − g, constraints ∪ C)
      Choose an action that would provide a causal explanation for g
         Let A be a new TQA for the action,
         and let R be the set of new TQAs implied by the axioms for A
         Let C be the constraints between A and R
         Expand(TQAs ∪ {A} ∪ R, constraints ∪ C)

Figure 12: A non-deterministic CBI planning algorithm. The algorithm recursively finds a TQA without a causal
explanation and either coalesces that interval with an existing interval or introduces a new interval into the plan that 
will provide a causal explanation for the interval. Statements beginning with choose are backtrack points. 

To see how this algorithm works, consider our simple spacecraft problem where the goal is to obtain an image of
asteroid A37. Initially, the model contains intervals for each of the initial conditions, as well as intervals for each of the
goal conditions. For our simple spacecraft example, this might look like Figure 13. There are, as yet, no constraints on
when the intervals for the initial conditions (Pointing(Earth), Status(Camera1, Off), Status(Camera2, On), and
CalibrationTarget(T17)) might end. Likewise, there are no constraints on when the interval corresponding to the goal condition
Image(A37) will start (except that it must be after the Past interval). There is no causal explanation for Image(A37), so that
TQA is selected by the algorithm. In our example, there is only one way of producing this TQA, and that is to introduce
a TakeImage action. According to the constraints, TakeImage must be contained by an interval in which the spacecraft is
pointing at the target asteroid and by an interval in which the (as yet unspecified) instrument is calibrated, so these two
intervals are introduced into the plan along with the corresponding constraints. We can also infer at this stage that the


9. There is a subtle difference between the relations in the shorthand notation and the underlying interval relations of the same
name. The interval relations can be inverted (i.e., Contains(I, J) implies Contained-by(J, I)), but we cannot invert the relations in the
shorthand notation. For example, stating that:

Status(?instrument, Calibrated) contains TakeImage(?target, ?instrument)

would be wrong; just because an interval exists in which Status(?instrument, Calibrated) holds does not mean that a TakeImage action
must occur. Thus, the shorthand is directional, whereas the underlying interval relations are not.
Figure 13: Initial partial plan for the spacecraft problem. 

interval for Pointing(A37) must come after the interval for Pointing(Earth), because Pointing(Earth) meets Past, and because
the spacecraft cannot point in more than one direction at once. The resulting plan is shown in Figure 14.



Figure 14: Partial plan for the spacecraft problem after adding a TakeImage action. 

There are now two TQAs in the plan without causal explanations. To achieve Pointing(A37), a Turn step must be
introduced, and to achieve Status(?instr, Calibrated), a Calibrate step must be introduced. The constraints associated with these
steps cause the introduction of several more TQAs as shown in Figure 15. At this stage, we cannot yet infer anything



Figure 15: Partial plan for the spacecraft problem after adding a Turn action. 


about the temporal relationships between Pointing(?caltarget) and any of the other three pointing TQAs, because we have
not yet ruled out A37, Earth, or ?direction as possible calibration targets. However, we can choose to coalesce several
pairs of intervals at this stage. In particular, we can coalesce both CalibrationTarget(?caltarget) and Status(?instr, On) with
initial condition intervals, and we can coalesce Pointing(?caltarget) with Pointing(?direction). The resulting partial plan is
shown in Figure 16. Only Pointing(T17) remains without a causal explanation, and it can only be achieved by introducing
another Turn step. After adding this step and the intervals required by the constraints on Turn, the remaining unexplained
pointing interval can be coalesced with the initial conditions to give the finished plan shown in Figure 17.


For this simple example, the order in which we made decisions was quite lucky. After the second step, we could have
coalesced Pointing(?caltarget) with Pointing(A37) or Pointing(Earth). Likewise, we could have coalesced Pointing(?direction)
with Pointing(Earth). None of these possibilities would have worked, so the planner would have ultimately been forced
to backtrack and try different alternatives.
Figure 16: Partial plan for the spacecraft problem after coalescing compatible intervals. 



So far, we have presented CBI planning in a way that emphasizes the similarities with POCL planning, but there are 
some important differences. For CBI planners the temporal relationships and reasoning are often more complex. As a 
result, CBI planners typically make use of an underlying constraint network to keep track of the TQAs and constraints 
in a plan. For each interval in the plan, two variables are introduced into the constraint network, corresponding to the 
beginning and ending points of the interval. Constraints on the intervals then translate into simple equality and inequal- 
ity constraints between end points. Interval durations translate into constraints on the distance between start and end 
points. Inference and consistency checking in this constraint network can often be accomplished using fast variations 
of arc-consistency, such as Simple Temporal Propagation [37]. 
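
The endpoint encoding described above can be illustrated with a small sketch. Each interval contributes a start point and an end point, interval relations become bounds on the distance between points, and the network is inconsistent exactly when the resulting distance graph contains a negative cycle. The class name SimpleTemporalNetwork, its methods, and the durations used below are assumptions made for this illustration. The consistency check uses Floyd-Warshall shortest paths, a standard technique for simple temporal constraints, and is not necessarily the propagation method of [37] or of any particular CBI planner.

```python
import math
from typing import Dict, List, Tuple


class SimpleTemporalNetwork:
    """Time points with constraints lo <= t_b - t_a <= hi (hypothetical helper)."""

    def __init__(self) -> None:
        self.index: Dict[str, int] = {}
        # (i, j, w) encodes the difference constraint t_j - t_i <= w
        self.edges: List[Tuple[int, int, float]] = []

    def _point(self, name: str) -> int:
        return self.index.setdefault(name, len(self.index))

    def constrain(self, a: str, b: str, lo: float = 0.0, hi: float = math.inf) -> None:
        """Require lo <= t_b - t_a <= hi."""
        i, j = self._point(a), self._point(b)
        self.edges.append((i, j, hi))
        self.edges.append((j, i, -lo))

    def interval(self, name: str, min_dur: float = 0.0, max_dur: float = math.inf) -> None:
        """Introduce start/end points for an interval with a bounded duration."""
        self.constrain(name + ".start", name + ".end", min_dur, max_dur)

    def meets(self, a: str, b: str) -> None:
        self.constrain(a + ".end", b + ".start", 0.0, 0.0)

    def before(self, a: str, b: str) -> None:
        self.constrain(a + ".end", b + ".start", 0.0, math.inf)

    def contains(self, outer: str, inner: str) -> None:
        self.constrain(outer + ".start", inner + ".start", 0.0, math.inf)
        self.constrain(inner + ".end", outer + ".end", 0.0, math.inf)

    def consistent(self) -> bool:
        """All-pairs shortest paths; a negative cycle means no schedule exists."""
        n = len(self.index)
        d = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
        for i, j, w in self.edges:
            d[i][j] = min(d[i][j], w)
        for k in range(n):
            for i in range(n):
                for j in range(n):
                    if d[i][k] + d[k][j] < d[i][j]:
                        d[i][j] = d[i][k] + d[k][j]
        return all(d[i][i] >= 0 for i in range(n))


stn = SimpleTemporalNetwork()
stn.interval("Turn", 0, 30)          # a turn takes at most 30 minutes
stn.interval("Pointing(A37)")
stn.interval("TakeImage", 1)         # assumed minimum exposure of 1 minute
stn.meets("Turn", "Pointing(A37)")
stn.contains("Pointing(A37)", "TakeImage")
ok_before = stn.consistent()         # the plan fragment can be scheduled
stn.before("TakeImage", "Turn")      # demand the image before the turn that enables it
ok_after = stn.consistent()          # this ordering contradicts the constraints above
print(ok_before, ok_after)
```

In this sketch the first check succeeds and the second fails, which is the kind of inconsistency that would trigger backtracking in step 1 of the Expand procedure.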

In addition to handling temporal relationships, the constraint network can also be used to explicitly encode other
choices, such as the possible values for uninstantiated parameters. In our example above, the variables ?instr, ?caltarget, and
?direction could appear explicitly in the network with values corresponding to the possible instantiations. If variable
typing information is provided, this mechanism can be very effective in reducing the possibilities for coalescing intervals
and introducing new TQAs. The constraint network can also be used to explicitly encode alternative choices for
achieving a particular TQA. For example, if there were more than one possible way of orienting the spacecraft in a particular
direction, an explicit variable could be introduced that corresponds to that choice, and constraint propagation could
potentially be used to help refine this set of choices.
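
As a rough sketch of how parameter domains might prune coalescing choices, the following treats each uninstantiated parameter as a variable with a set of candidate values and narrows those sets when a parameterized TQA is matched against a ground one. The function coalesce, the tuple encoding of TQAs, and the particular domains (including the typing of ?caltarget as a calibration target) are assumptions for illustration only; temporal constraints are ignored here.

```python
from typing import Dict, Optional, Set, Tuple

Domains = Dict[str, Set[str]]
Atom = Tuple[str, ...]  # predicate name followed by its arguments


def coalesce(pattern: Atom, fact: Atom, domains: Domains) -> Optional[Domains]:
    """Return narrowed domains if `pattern` can be coalesced with the ground
    `fact`, or None if the match is impossible. The input is not modified."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    narrowed = {var: set(values) for var, values in domains.items()}
    for arg, const in zip(pattern[1:], fact[1:]):
        if arg.startswith("?"):
            narrowed[arg] &= {const}
            if not narrowed[arg]:
                return None  # domain wiped out: this coalescing is ruled out
        elif arg != const:
            return None      # constants disagree
    return narrowed


# Hypothetical typed domains for the spacecraft example.
domains: Domains = {
    "?instr": {"Camera1", "Camera2"},
    "?caltarget": {"T17"},                  # only calibration targets
    "?direction": {"Earth", "A37", "T17"},
}

r1 = coalesce(("Status", "?instr", "On"), ("Status", "Camera2", "On"), domains)
r2 = coalesce(("Status", "?instr", "On"), ("Status", "Camera1", "Off"), domains)
r3 = coalesce(("Pointing", "?caltarget"), ("Pointing", "Earth"), domains)
print(r1["?instr"] if r1 else None, r2, r3)
```

Under these assumed domains, coalescing Pointing(?caltarget) with Pointing(Earth) is rejected immediately, so the planner would not need to discover that dead end by backtracking.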

CBI planners can be viewed as dynamic constraint satisfaction engines - the planner alternately adds new TQAs and 
constraints to the network, then uses constraint satisfaction techniques to propagate the effects of those constraints and 
to check for consistency. This generality is the strength of the CBI approach: 

1. The interval representation can be used to express a wide variety of temporal constraints on actions, events, and 
fluents. 

2. The underlying representation of partial plans as a constraint network provides a single uniform mechanism for 
doing temporal inference, resolving conflicts between TQAs, and refining parameter choices for actions. 
As a result, the planning algorithm is particularly simple, and much of the control of the algorithm takes the form of
variable and value ordering in the underlying constraint network. In short, the interval representation and dynamic
constraint satisfaction provide a clean, expressive framework for tackling planning problems. This framework also seems
to mesh nicely with the CSP approach to scheduling, discussed in Section 3. In particular, we believe that edge-finding
techniques could be incorporated into CBI planners to narrow the possible times for intervals. Similarly, we believe
that texture-based heuristics could be incorporated to help guide the process of resolving resource conflicts between
intervals.

The biggest potential drawback of CBI planning is performance. To date, CBI planning has not been studied widely,
and it is not as well understood as most other planning techniques. Serious comparisons with other planning techniques
have not been carried out, partly because CBI techniques have been aimed at problems that require a much richer
representation of time and action. From a theoretical point of view, it is not clear why CBI planners would perform any
better than POCL planners (which are currently outperformed by Graphplan and SATPLAN). As we illustrated in the
example above, a CBI planner goes through the same decisions and steps as a POCL planner, yet a POCL planner only
keeps track of the propositions that are relevant between actions (causal links) rather than the state of all affected
propositions over time.

One might argue that using an underlying constraint network and constraint satisfaction techniques allows a CBI
planner to deal with temporal constraints, variable bindings, and alternative choices more efficiently. However, this seems
unlikely to have a significant impact on performance, because the same fundamental set of choices must be considered
by the planner. Similarly, one might argue that the use of general constraint propagation allows a CBI planner to weed
out more possibilities that lead to dead-end plans. This also seems unlikely, since there is little evidence that extreme
least commitment strategies (such as those suggested by Joslin [76, 77]) actually pay off.

Several planning systems have been built that use a CBI approach, including Allen’s Trains planner [48], Joslin’s
Descartes [76, 77], the HSTS/Remote Agent (RA) planner [98, 74, 75], IxTeT [60, 87], and (to a large extent) Zeno [107].
The focus of Allen’s work has been primarily on formal properties and on mixed initiative planning, while Joslin’s
planner was intended as a demonstration of how far a least-commitment approach to planning could be taken. As we
mentioned earlier, Zeno concentrated on metric quantities rather than on efficient temporal reasoning. None of these
planners have been extensively tested on real world or benchmark problems.

The two most practical CBI planners are IxTeT and the HSTS/RA planner, both of which have been applied to
space-related problems. The HSTS/RA planner was used to generate plans during the Remote Agent Experiment [99]
onboard the NASA Deep Space One spacecraft. During the experiment, the planner successfully generated complex plans
that included turns, observations, navigation, thrusting, communications, and other aspects of spacecraft operations
while taking into account limited resources, task durations, and time limits. Unfortunately, in order to achieve this
performance, the search process had to be carefully controlled with problem-dependent, hand-crafted heuristics.

There are two unique characteristics of the HSTS/RA planner that may have also contributed to its success. First,
the planner makes no distinction between propositions holding over intervals and actions taking place over intervals.
Instead of working on proposition intervals that do not have a causal explanation, the planner works on intervals whose
constraints have not yet been satisfied. For this strategy to work, explicit constraints must be provided that indicate the
possible ways of achieving each proposition. (These are often referred to as explanatory frame axioms.) For example,
a pointing proposition must either be true initially or it must be achieved by turning. As a result, we would need to state
that:

Pointing(?target)    met-by Past
                     or
                     met-by Turn(?target)

Having supplied this constraint, new turning intervals would be introduced automatically whenever a pointing interval
could not be coalesced with an existing pointing interval. The potential advantage of this approach is that it allows the
designer to carefully control which actions are considered for given subgoals in much the same way as in an HTN
planner.
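
One way to picture such explanatory frame axioms is as a table that lists, for each kind of proposition interval, the alternative intervals it may be met by. The sketch below is a hypothetical encoding, not the notation used by the HSTS/RA planner: the table EXPLANATIONS, the function explanations, and the "{0}" placeholder for the proposition's first argument are all assumptions. Only the Pointing entry is taken from the constraint above.

```python
from typing import Dict, List, Tuple

# For each predicate, the alternative (relation, producing interval) pairs.
# "{0}" is replaced by the proposition's first argument.
EXPLANATIONS: Dict[str, List[Tuple[str, str]]] = {
    "Pointing": [("met-by", "Past"), ("met-by", "Turn({0})")],
}


def explanations(predicate: str, *args: str) -> List[Tuple[str, str]]:
    """Instantiate the designer-supplied alternatives for one proposition interval."""
    return [
        (relation, producer.format(*args))
        for relation, producer in EXPLANATIONS.get(predicate, [])
    ]


options = explanations("Pointing", "A37")
print(options)  # [('met-by', 'Past'), ('met-by', 'Turn(A37)')]
```

Because the planner would consider only the alternatives listed in the table, the designer decides which actions are ever proposed for a given kind of subgoal.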

A second characteristic of the HSTS/RA planner that may have contributed to its success is that it represents the world
using variable/value assignments rather than propositions. In our spacecraft example, there would be one variable, or
timeline, for each camera’s status, for the direction in which the spacecraft is pointing, and for the contents of the image
buffer. The constraints are therefore specified in terms of the intervals over which variables take on particular values.
For example, the constraints on a turning activity could be specified as: 10

Pointing=Turn(?target)    met-by Pointing=?direction

                          meets Pointing=?target

In this case, there is an interval in which the Pointing variable has the value ?direction, followed by an interval in which 
the Pointing value is in transition, followed by an interval in which the Pointing value is ?target. This kind of variable/ 
value or functional representation has a subtle advantage over the propositional representation: no explicit axioms or 
reasoning steps are required to recognize that Pointing cannot take on two values at once. Recognizing that two intervals 
are in conflict happens automatically as a result of the fact that different valued intervals can never overlap for any given
variable. 11
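
The point about automatic conflict detection can be sketched with a single timeline that refuses any interval whose value differs from an interval it would overlap. The class Timeline and the fixed numeric times are assumptions made for illustration; an actual CBI planner would keep interval endpoints flexible in the constraint network rather than committing to specific times.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Timeline:
    """One state variable and the valued intervals placed on it (hypothetical helper)."""
    name: str
    tokens: List[Tuple[float, float, str]] = field(default_factory=list)

    def add(self, start: float, end: float, value: str) -> bool:
        """Place `value` on [start, end); refuse if a different value overlaps."""
        for s, e, v in self.tokens:
            if start < e and s < end and v != value:
                return False  # two values at once: conflict, no extra axiom needed
        self.tokens.append((start, end, value))
        return True


pointing = Timeline("Pointing")
results = [
    pointing.add(0, 10, "Earth"),
    pointing.add(10, 25, "Turn(A37)"),   # the value in transition
    pointing.add(25, 60, "A37"),
    pointing.add(20, 30, "T17"),         # overlaps the turn and Pointing=A37
]
print(results)  # [True, True, True, False]
```

The mutual exclusion between pointing directions falls out of the data structure itself, whereas a propositional encoding would need an explicit axiom stating that Pointing(A37) and Pointing(T17) cannot hold together.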

5 Conclusions 

If we consider the different methods of planning discussed in this paper, fundamentally, there are three different ways 
that planning has been done: 

Stratified P&S. The approach usually taken in the scheduling community is to separate planning from schedul- 
ing. Action choices are made first, then the resulting scheduling problem is solved. For example, in scheduling 
problems where there are process alternatives for certain steps, these decisions are usually made first, then the 
resulting scheduling problem is solved. If there is no solution, or the solution is not good enough, an alternative 
set of choices is made and the scheduling problem is solved again. 

Interleaved P&S. The approach taken in POCL planning (and many other classical planning techniques) allows 
interleaving of action choices with ordering decisions. Typically an action choice is made, and conflicts are 
resolved with other actions by imposing ordering constraints on the actions. Traditionally, ordering decisions for 
these planners have been relatively simple. The CBI approach uses a representation much closer to that used in 
scheduling and offers hints that powerful scheduling techniques and heuristics could be integrated into such a 
framework. 

Homogeneous P&S. The approach taken in SAT planning and ILP planning is to turn a planning problem into a 
scheduling problem. In the resulting scheduling problem there is no distinction between action choices and order- 
ing decisions and the decisions are made in a single homogeneous manner. To do this, the planning problem must 
be bounded in some way, usually by the number of discrete time steps needed. 

Table 1 summarizes our assessment of four of the most promising planning approaches. Each entry in the table
indicates how difficult we think it will be to extend the approach to handle the indicated issue in an effective manner. If
there has already been work on an issue, we have included a reference. However, the presence of a reference does not
necessarily mean that the issue has been completely or adequately addressed. For Graphplan, performance is very
good. There has been some work on extending the approach to allow metric quantities [85] and continuous time [125],
but this work is still preliminary and of limited scope. These extensions also seem to be complex and difficult to
engineer. As a result, we believe it will be relatively hard to extend Graphplan to fully cover any of these issues. SAT planners
have also exhibited very good performance. The work on LPSAT [139] shows that the framework can be extended to
allow fairly general reasoning with metric quantities. We expect that multiple-capacity resources will yield to the same
kinds of techniques used in IxTeT, although this has not yet been attempted. Branch-and-bound can be used to do
optimization with systematic solvers, but when combined with metric quantities, optimization is not as straightforward
or seamless as in an ILP framework. A big problem for both SAT and ILP planning is continuous time - a discrete time
encoding will not work, and causal encodings have not proven very practical. The ILP approach is attractive because
of the natural ability to handle multiple-capacity resources, metric quantities, and optimization. However, performance
is not yet up to the standards of Graphplan or SAT planning techniques. CBI planning is attractive because of its ability
to handle continuous time. In particular, the HSTS/RA planner [75] exhibits efficient handling of complex temporal
constraints. Zeno [107] has shown that general metric quantities can be handled within the CBI framework. Zeno had
serious performance limitations, but we believe that this was related to its handling of time, rather than to the use of

10. For pedagogical reasons we have taken liberties with the actual notation and algorithm used in the HSTS/RA planner. 

11. Peot [109] and Geffner [22] have also argued that there are significant advantages to a functional representation. 
LP techniques for handling metric quantities. Similarly, IxTeT [87] has shown that multiple capacity resources can be
handled efficiently within the CBI framework. The primary difficulty for CBI techniques has been in controlling the
underlying goal-directed planning search. We speculate that CBI planners need stronger heuristic guidance - the kind
of guidance provided by a planning graph or by the kind of automated analysis used by Geffner for FSS planning [23].



For many years there has been a wide gulf between planning technology and scheduling technology. Work on planning 
has concentrated on problems involving many levels of action choice, where the relationships between actions are var- 
ied and complex. Unfortunately, most work on classical planning has assumed a very simple, discrete model of time 
and has ignored issues of resources, metric quantities, and optimization. These issues are critical in most scheduling 
problems and, we would argue, in most realistic planning problems, such as our spacecraft problem. As a result, much 
of the work on classical planning has not applied to realistic scheduling problems or to our spacecraft problem. 

Fortunately, there are some encouraging signs that this situation may be changing. Much of the work described in Sec- 
tion 4 is promising. There are still many speculative entries in Table 1, but there is hope for the ambitious spacecraft, 
even though much work remains to be done. 